Detect & delete unreferenced code with ArchUnit

When you maintain a large Java project for a longer period, the moments where you’re finally able to remove unused code can be very satisfying. No more upkeep, library version migrations or dark corners to maintain, for code that’s no longer being used. But finding out which parts of the code base can be removed can be a challenge, and tooling in this space seems not to have kept pace with recent development practices in Java. In this post we’ll outline an approach to find unreferenced code with ArchUnit, which allows you to iteratively detect & delete unused code from your Java projects.

ArchUnit

ArchUnit is a free, simple and extensible library for checking the architecture of your Java code using any plain Java unit test framework. That is, ArchUnit can check dependencies between packages and classes, layers and slices, check for cyclic dependencies and more. It does so by analyzing given Java bytecode, importing all classes into a Java code structure.
— Archunit website

ArchUnit itself has been covered before by Niels, who outlined its usefulness in guarding the architectural conventions on a project.

In this post we’ll expand on the custom rules mentioned briefly before, as well as adopt the new ArchUnit JUnit5 test style and lambdas. We’ll utilize ArchUnit’s graph of dependencies between classes and methods to detect unreferenced classes and methods, which are our candidates for deletion. We’ll also explore both generic and project specific ways to prevent false positives from polluting the results.

GitHub repository & Test structure

At time of writing we’re using ArchUnit 0.23.0. Specifically, we’re using the JUnit5 support through com.tngtech.archunit:archunit-junit5:0.23.0. That brings the basic test structure outline to:

@AnalyzeClasses(
  packagesOf = ArchunitUnusedRuleApplication.class,
  importOptions = {
    ImportOption.DoNotIncludeArchives.class,
    ImportOption.DoNotIncludeJars.class,
    ImportOption.DoNotIncludeTests.class
})
class ArchunitUnusedRuleApplicationTests {
  @ArchTest
  static ArchRule classesShouldNotBeUnused = classes()
    .that()
      ...
    .should(
      ...
    );
  @ArchTest
  static ArchRule methodsShouldNotBeUnused = methods()
    .that()
      ...
    .should(
      ...
    );
}

Notice how we’re limiting classes analyzed through the use of packagesOf and importOptions arguments on @AnalyzeClasses(). Also, the use of @ArchTest on a static ArchRule field alleviates us from having to call ArchRule.check(JavaClasses) as shown in the earlier blogpost.

With the classes() and methods() selectors from ArchRuleDefinition we select what elements we want to check with our rule. Typically these elements are then further restricted by chaining calls after .that(), to weed out potential false positives. Finally, with .should() we check that all remaining classes satisfy our given condition, and raise an exception when any are found.

Detecting unused methods

With the above basic structure in place, we can start to define our first rule to detect unused methods. As we’ve said before ArchUnit builds up a graph of dependencies between Java code units, which we can use to find code units that are never referenced from other Java code units.

Of course, in a typical Spring Boot application there are legitimate reasons why a method is never invoked directly, in particular when annotated to be a web endpoint, message listener or command-, event- or exception handler. In these instances the framework will invoke the methods for you, so you would not want these accidentally marked as unreferenced in your test rule. The same goes for common methods added in particular when using Lombok’s @Data or @Value, which add equals, hashCode and toString methods to classes.

Putting all of these constraints together we arrive at the following ArchRule to find unused methods.

private static Predicate<JavaMethod> hasMatchingNameAndParameters(JavaMethod input) {
  return m -> m.getName().equals(input.getName())
      && m.getRawParameterTypes().size() == input.getRawParameterTypes().size()
      && (m.getDescriptor().equals(input.getDescriptor())
          || namesOf(m.getRawParameterTypes()).containsAll(namesOf(input.getRawParameterTypes())));
}

private static DescribedPredicate<JavaMethod> methodHasAnnotationThatEndsWith(String suffix) {
  return describe(String.format("has annotation that ends with '%s'", suffix),
      method -> method.getAnnotations().stream()
          .anyMatch(annotation -> annotation.getRawType().getFullName().endsWith(suffix)));
}

private static ArchCondition<JavaMethod> shouldBeReferencedMethod = new ArchCondition<>("not be unreferenced") {
  @Override
  public void check(JavaMethod javaMethod, ConditionEvents events) {
    Set<JavaCodeUnitAccess<?>> accesses = new HashSet<>(javaMethod.getAccessesToSelf());
    accesses.removeAll(javaMethod.getAccessesFromSelf());
    if (accesses.isEmpty()) {
      events.add(new SimpleConditionEvent(javaMethod, false, String.format("%s is unreferenced in %s",
          javaMethod.getDescription(), javaMethod.getSourceCodeLocation())));
    }
  }
};

private static ArchRule methodsShouldNotBeUnused = methods()
    .that(describe("are not declared in super type", input -> !input.getOwner().getAllRawSuperclasses().stream()
        .flatMap(c -> c.getMethods().stream()).anyMatch(hasMatchingNameAndParameters(input))))
    .and(describe("are not declared in interface", input -> !input.getOwner().getAllRawInterfaces().stream()
        .flatMap(i -> i.getMethods().stream()).anyMatch(hasMatchingNameAndParameters(input))))
    .and().doNotHaveName("main")
    .and().areNotMetaAnnotatedWith(RequestMapping.class)
    .and(not(methodHasAnnotationThatEndsWith("Handler")
        .or(methodHasAnnotationThatEndsWith("Listener"))
        .or(methodHasAnnotationThatEndsWith("Scheduled"))
        .and(declaredIn(describe("component", clazz -> clazz.isMetaAnnotatedWith(Component.class))))))
    .should(shouldBeReferencedMethod)
    .as("should use all methods")
    .because("unused methods should be removed");

@ArchTest
public static ArchRule methodsShouldNotBeUnusedFrozen = freeze(methodsShouldNotBeUnused);

Detecting unused classes

To detect entire classes that are unreferenced from other classes, we can apply the same approach with a few minor tweaks. We would not want to falsely identify any @Component that contains a web endpoint, listener or handler either, so we need yet another custom predicate. In our condition check we also check whether JavaClass.getDirectDependenciesToSelf() turns up any dependencies, to weed out yet another source of false positives.

Eventually, we end up with the following ArchRule to find unused classes.

private static DescribedPredicate<JavaClass> classHasMethodWithAnnotationThatEndsWith(String suffix) {
  return describe(String.format("has method with annotation that ends with '%s'", suffix),
      clazz -> clazz.getMethods().stream()
          .flatMap(method -> method.getAnnotations().stream())
          .anyMatch(annotation -> annotation.getRawType().getFullName().endsWith(suffix)));
}

private static ArchCondition<JavaClass> shouldBeReferencedClass = new ArchCondition<>("not be unreferenced") {
  @Override
  public void check(JavaClass javaClass, ConditionEvents events) {
    Set<JavaAccess<?>> accesses = new HashSet<>(javaClass.getAccessesToSelf());
    accesses.removeAll(javaClass.getAccessesFromSelf());
    if (accesses.isEmpty() && javaClass.getDirectDependenciesToSelf().isEmpty()) {
      events.add(new SimpleConditionEvent(javaClass, false, String.format("%s is unreferenced in %s",
          javaClass.getDescription(), javaClass.getSourceCodeLocation())));
    }
  }
};

private static ArchRule classesShouldNotBeUnused = classes()
    .that().areNotMetaAnnotatedWith(org.springframework.context.annotation.Configuration.class)
    .and().areNotMetaAnnotatedWith(org.springframework.stereotype.Controller.class)
    .and(not(classHasMethodWithAnnotationThatEndsWith("Handler")
        .or(classHasMethodWithAnnotationThatEndsWith("Listener"))
        .or(classHasMethodWithAnnotationThatEndsWith("Scheduled"))
        .or(describe("implements interface", clazz -> !clazz.getAllRawInterfaces().isEmpty()))
        .or(describe("extends class", clazz -> 1 < clazz.getAllRawSuperclasses().size()))
        .and(metaAnnotatedWith(Component.class))))
    .should(shouldBeReferencedClass)
    .as("should use all classes")
    .because("unused classes should be removed");

@ArchTest
public static ArchRule classesShouldNotBeUnusedFrozen = freeze(classesShouldNotBeUnused);

Limitations

Now, while the above rules are a great start off point to identify potentially unused code, unfortunately, it’s also where we will also run into some limitations of ArchUnit. Since ArchUnit operates on the byte code, you might find String constants are inlined at compile time. This may lead to false positives which are marked as unused, whereas in reality they are still used in the source code.

Freezing false (or true!) positives

Fortunately there’s an elegant way to handle false positives with regards to our custom ArchConditions: Freezing Arch Rules. By passing our ArchRule into FreezingArchRule.freeze(ArchRule) we can record all current violations, and stop new violations from being added.

When rules are introduced in grown projects, there are often hundreds or even thousands of violations, way too many to fix immediately. The only way to tackle such extensive violations is to establish an iterative approach, which prevents the code base from further deterioration. FreezingArchRule can help in these scenarios by recording all existing violations to a ViolationStore. Consecutive runs will then only report new violations and ignore known violations. If violations are fixed, FreezingArchRule will automatically reduce the known stored violations to prevent any regression.
— Archunit website

If you notice any generic patterns in the violations it is of course preferable to exclude such classes from analysis with a .that() predicate. For specific violations however, freezing is a great approach to acknowledge their existence in the code base without polluting the generic rule.

Test ArchUnit rules themselves

Finally, you’ll want to ensure the rules you create actually find violations when present. For this you can setup unit tests which import classes specifically crafted to contain a violation, and assert the violation is reported. This step is of course optional, but recommended especially when sharing rules across multiple projects. A sample test might look like this.

@Nested
class VerifyRulesThemselves {
  @Test
  void verifyClassesShouldNotBeUnused() {
     JavaClasses javaClasses = new ClassFileImporter()
       .withImportOption(ImportOption.Predefined.DO_NOT_INCLUDE_ARCHIVES)
       .withImportOption(ImportOption.Predefined.DO_NOT_INCLUDE_JARS)
       .withImportOption(ImportOption.Predefined.DO_NOT_INCLUDE_TESTS)
       .importPackagesOf(ArchunitUnusedRuleApplication.class);
     AssertionError error = assertThrows(AssertionError.class,
       () -> classesShouldNotBeUnused.check(javaClasses));
     assertEquals("""
       Architecture Violation [Priority: MEDIUM] - Rule 'should use all classes, because unused classes should be removed' was violated (3 times):
       Class <com.github.timtebeek.archunit.ComponentD> is unreferenced in (ArchunitUnusedRuleApplication.java:0)
       Class <com.github.timtebeek.archunit.ModelF> is unreferenced in (ArchunitUnusedRuleApplication.java:0)
       Class <com.github.timtebeek.archunit.PathsE> is unreferenced in (ArchunitUnusedRuleApplication.java:0)""",
       error.getMessage());
  }

  @Test
  void verifyMethodsShouldNotBeUnused() {
    JavaClasses javaClasses = new ClassFileImporter()
      .withImportOption(ImportOption.Predefined.DO_NOT_INCLUDE_ARCHIVES)
      .withImportOption(ImportOption.Predefined.DO_NOT_INCLUDE_JARS)
      .withImportOption(ImportOption.Predefined.DO_NOT_INCLUDE_TESTS)
      .importPackagesOf(ArchunitUnusedRuleApplication.class);
    AssertionError error = assertThrows(AssertionError.class,
      () -> methodsShouldNotBeUnused.check(javaClasses));
    assertEquals("""
      Architecture Violation [Priority: MEDIUM] - Rule 'should use all methods, because unused methods should be removed' was violated (2 times):
      Method <com.github.timtebeek.archunit.ComponentD.doSomething(com.github.timtebeek.archunit.ModelD)> is unreferenced in (ArchunitUnusedRuleApplication.java:102)
      Method <com.github.timtebeek.archunit.ModelF.toUpper()> is unreferenced in (ArchunitUnusedRuleApplication.java:143)""",
      error.getMessage());
  }
}

Conclusion

With the above rules in place you can be sure new code changes won’t inadvertently leave any new or old code unreferenced. Any changes to what was previously, or is now unreferenced will be maintained in the freeze store right inside the repository. Together these rules will help keep your code base no larger than it needs to be, allowing you to focus on what’s actually used. Now you start to iteratively detect & delete unused code, and see what pops up next when removed endpoints, methods and classes no longer reference their respective dependencies.