pen-test-automation

A framework for automating penetration testing using a plugin-based architecture.

Primary language: Java. License: Apache License 2.0 (Apache-2.0).

About

This project contains the Pen Test Automation (PTA) platform—a service that generates commands for supported penetration testing tools.

The goal of this project is to enable automated application security testing via existing security tools. These tools are typically written for human application security testers, and are usually poorly designed for automated testing. By encapsulating the operation of these security tools, we can normalize the execution of the testing, so that another execution platform (such as CQF) can run the testing.

To execute specific application security attacks, certain information is required for a given security tool to operate. If the PTA platform can become aware of what information is needed, it can provide that information using a combination of suspected findings and threat modeling information, and prompt the user for missing information.

Overview

The PTA platform is built around a tool wrapper plugin architecture. The plugin architecture lets new security tools be supported without modifying the PTA platform itself. By creating tool plugins, the PTA platform can become aware of a security tool's capabilities and use them. Using a plugin architecture allows others (e.g., the open source community) to add tools to the PTA platform without our involvement.

A fully implemented wrapper plugin for a security tool does three high-level things:

  1. Advertises Itself to the PTA platform so the tool's capabilities and requirements are known
  2. Generates Commands for the tool that can be executed by the PTA platform
  3. Parses Results from the tool into a meaningful result for the PTA platform and downstream systems

The Pen Test Automation platform in turn can take commands generated by a wrapper plugin (expressed as ToolCommand objects) and facilitate the logic to execute those tools using supported command executors. Upon completing the execution, the PTA platform takes the result and uses the wrapper plugin to parse the results of execution.

Supporting New Tools

To support simple integration with CQF, we established the convention that supported tools should take in a file as input and generate a file as output. As a result, adding support to the Pen Test Automation Platform for a new tool generally consists of:

  1. Orchestrating the tool to follow the above convention
  2. Creating a tool wrapper plugin so that the PTA platform can utilize the tool
  3. Installing the tool in the target execution environment

Orchestration Convention

As mentioned, a tool should take a file as input and generate a file as output. To that end, we have written orchestration scripts around two existing tools to match this convention. For example, we created an orchestration script called esm-7 around the sqlmap tool to parse a configuration file, feed those configuration values as the appropriate parameters, and pipe the output to an output file.

The orchestration script also simplifies the execution of sqlmap by setting some default parameters and obtaining others. Similarly, we created an orchestration script called crydra-16 that applies the same principles for the Hydra tool. You may wish to follow the same process if your tool does not already follow this convention.
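The file-in, file-out convention can be illustrated with a minimal sketch. It is not the actual esm-7 script: the configuration keys `target.url` and `dbms`, the class name, and the choice of Java are assumptions made for illustration. Only the sqlmap options `-u`, `--batch`, and `--dbms` are real sqlmap flags.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical sketch of a file-in, file-out orchestration wrapper around sqlmap. */
final class OrchestrationSketch {

    /** Translates key=value configuration text into a sqlmap argument list. */
    static List<String> buildSqlmapCommand(String configText) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        String targetUrl = config.getProperty("target.url"); // hypothetical key
        if (targetUrl == null || targetUrl.isEmpty()) {
            throw new IllegalArgumentException("target.url is required");
        }

        List<String> command = new ArrayList<>();
        command.add("sqlmap");
        command.add("-u");
        command.add(targetUrl);
        command.add("--batch"); // default parameter: never prompt, so the run stays unattended

        String dbms = config.getProperty("dbms"); // hypothetical optional key
        if (dbms != null && !dbms.isEmpty()) {
            command.add("--dbms=" + dbms);
        }
        return command;
    }

    /** Reads the input file, runs the tool, and sends all tool output to the output file. */
    static int run(Path configFile, Path outputFile) throws IOException, InterruptedException {
        String configText = new String(Files.readAllBytes(configFile), StandardCharsets.UTF_8);
        Process process = new ProcessBuilder(buildSqlmapCommand(configText))
                .redirectErrorStream(true)            // merge stderr into stdout
                .redirectOutput(outputFile.toFile())  // write everything to the output file
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only builds and prints the command; nothing is executed here.
        System.out.println(buildSqlmapCommand("target.url=http://example.test/item?id=1\ndbms=MySQL\n"));
    }
}
```

Separating command construction from execution keeps the parameter mapping testable without sqlmap installed.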

Tool Wrapper Plugins

The PTA platform uses Java's ServiceLoader SPI (https://docs.oracle.com/javase/tutorial/ext/basics/spi.html), introduced in Java 6. Using the ServiceLoader, the PTA platform can load any class that implements the ToolWrapper interface and registers itself in a provider-configuration file inside its JAR. That file lives under META-INF/services/ and is named after the fully qualified name of the service interface.

If you have a security tool that you want to wrap to be used in PTA, the process to do so is fairly straightforward.

The first step to creating a ToolWrapper plugin is to create a class (e.g., MyToolWrapper) that implements the ToolWrapper interface. Your class needs to implement each of the required methods in the interface:

  • Advertise Itself
      • String getToolName(): this method should return the name of the security tool being wrapped.
      • Set<ToolParameter> getToolParameters(): this method should return a Set of ToolParameter instances that describe the parameters necessary for the security tool being wrapped.
      • Set<String> getSupportedAttacks(): this method should return a Set of String values representing the attacks supported by the security tool being wrapped. While the String is ostensibly intended to be a CAPEC identifier, you can include your own standard attack designations and identifiers as appropriate.
  • Generate Command
      • ToolCommand generateToolCommand(Tta3Attack attack): this method generates a ToolCommand object based on the given Tta3Attack object. The Tta3Attack encapsulates the chosen attack pattern (i.e., CAPEC identifier) so that, if your wrapper supports multiple attack patterns, it can generate the appropriate command for the chosen attack. The Tta3Attack object also encapsulates any available application metadata so that your wrapper can tailor the tool command based on known circumstances (e.g., application architecture, database backend). The ToolCommand object encapsulates the name of the executable to invoke and the contents of the configuration file as input (see above Orchestration Convention regarding the input file convention).
  • Parse Result
      • ToolResult parseAttackResult(Tta3Attack attack, ExecutionResult executionResult): this method generates a ToolResult object based on the given Tta3Attack and ExecutionResult objects. The ExecutionResult encapsulates the standard out, standard error, exit code, and results file (see above Orchestration Convention regarding the output file convention).
      • optional boolean isSuccessfulExit(ExecutionResult executionResult): this method returns whether or not the tool exited successfully based on the given ExecutionResult. Using Default Methods from Java 8, this method by default returns true if the return code in the ExecutionResult is 0. You can override this default behavior if your tool behaves differently.

Note: the intent is to eventually provide a standard catalog of ToolParameter classes that map to properties that are defined or can be derived by the PTA platform; such a catalog would allow tool wrappers to simply return instances of those classes in the Set returned by getToolParameters(). Likewise, the Tta3Attack class returns a Map of String metadata values based on String keys; the intent is to provide a standard catalog of metadata keys that identify standard metadata values that are defined or derived by the PTA platform. You may wish to define your own custom ToolParameter properties and metadata values; if you do so, these values should be documented for users of your plugin.
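As a rough illustration of the interface described above, the sketch below implements a wrapper for a made-up tool. The method signatures follow this README, but every supporting type (ToolParameter, Tta3Attack, ToolCommand, ExecutionResult, ToolResult) is a simplified stand-in with assumed fields, not the real PTA class. The tool name `my-tool`, the metadata key `target.url`, and the `VULNERABLE` output marker are likewise hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

final class ToolWrapperSketch {

    // Simplified stand-ins for the PTA types (assumed shapes, not the real classes).
    static final class ToolParameter {
        final String name;
        ToolParameter(String name) { this.name = name; }
    }

    static final class Tta3Attack {
        final String capecId;
        final Map<String, String> metadata;
        Tta3Attack(String capecId, Map<String, String> metadata) {
            this.capecId = capecId;
            this.metadata = metadata;
        }
    }

    static final class ToolCommand {
        final String executable;
        final String configFileContents;
        ToolCommand(String executable, String configFileContents) {
            this.executable = executable;
            this.configFileContents = configFileContents;
        }
    }

    static final class ExecutionResult {
        final int exitCode;
        final String standardOut;
        final String standardError;
        ExecutionResult(int exitCode, String standardOut, String standardError) {
            this.exitCode = exitCode;
            this.standardOut = standardOut;
            this.standardError = standardError;
        }
    }

    static final class ToolResult {
        final boolean findingConfirmed;
        ToolResult(boolean findingConfirmed) { this.findingConfirmed = findingConfirmed; }
    }

    interface ToolWrapper {
        String getToolName();
        Set<ToolParameter> getToolParameters();
        Set<String> getSupportedAttacks();
        ToolCommand generateToolCommand(Tta3Attack attack);
        ToolResult parseAttackResult(Tta3Attack attack, ExecutionResult executionResult);

        // Java 8 default method: exit code 0 means success unless a wrapper overrides it.
        default boolean isSuccessfulExit(ExecutionResult executionResult) {
            return executionResult.exitCode == 0;
        }
    }

    /** Hypothetical wrapper for a made-up tool called "my-tool". */
    static final class MyToolWrapper implements ToolWrapper {
        @Override public String getToolName() { return "my-tool"; }

        @Override public Set<ToolParameter> getToolParameters() {
            return Collections.singleton(new ToolParameter("target.url"));
        }

        @Override public Set<String> getSupportedAttacks() {
            return Collections.singleton("CAPEC-66");
        }

        @Override public ToolCommand generateToolCommand(Tta3Attack attack) {
            // The config file contents follow the input-file convention.
            String url = attack.metadata.getOrDefault("target.url", "");
            return new ToolCommand(getToolName(), "target.url=" + url + "\n");
        }

        @Override public ToolResult parseAttackResult(Tta3Attack attack, ExecutionResult executionResult) {
            // Assumed output marker; a real wrapper would parse the tool's results file.
            boolean confirmed = isSuccessfulExit(executionResult)
                    && executionResult.standardOut.contains("VULNERABLE");
            return new ToolResult(confirmed);
        }
    }
}
```

A tool that signals success with a nonzero exit code would override isSuccessfulExit rather than change the parsing logic.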

The second step to creating a ToolWrapper plugin is to create a text file in the META-INF/services/ folder called com.aspectsecurity.astam.tta3.pentest.tools.spi.ToolWrapper. In this file, list the fully qualified class name of your plugin (e.g., com.example.MyToolWrapper). Include this resource file when packaging your plugin as a JAR. This file registers the plugin as a service provider so that it can be located by the ServiceLoader. See the ServiceLoader SPI documentation for more information.
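For the example above, the registration file META-INF/services/com.aspectsecurity.astam.tta3.pentest.tools.spi.ToolWrapper would contain the single line com.example.MyToolWrapper. On the loading side, discovery might look roughly like the sketch below. It uses the standard java.util.ServiceLoader API, but the nested ToolWrapper interface is a one-method stand-in for the real SPI interface, and the class and method names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class PluginLoaderSketch {

    /** Stand-in for the real ToolWrapper SPI interface (assumed shape). */
    interface ToolWrapper {
        String getToolName();
    }

    /** Collects every provider of the given SPI type registered under META-INF/services. */
    static <T> List<T> loadPlugins(Class<T> spi) {
        List<T> plugins = new ArrayList<>();
        for (T plugin : ServiceLoader.load(spi)) {
            plugins.add(plugin);
        }
        return plugins;
    }

    public static void main(String[] args) {
        List<ToolWrapper> wrappers = loadPlugins(ToolWrapper.class);
        System.out.println("Discovered " + wrappers.size() + " tool wrapper(s)");
        for (ToolWrapper wrapper : wrappers) {
            System.out.println(" - " + wrapper.getToolName());
        }
    }
}
```

With no provider JAR on the classpath the list is simply empty, which is why a missing or misnamed registration file tends to show up as a plugin that silently never loads.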

Examples

We have created three tool plugins that can be used for reference:

  • crydra16-tool-spi
  • dummy-tool-spi
  • esm7-tool-spi

The esm7-tool-spi supports CAPEC-7 and CAPEC-66 through the esm-7 attack orchestration script; the crydra16-tool-spi supports CAPEC-16 through the crydra-16 attack orchestration script; the dummy-tool-spi supports a fictional CAPEC-0 and is used strictly for testing purposes.

Command Executors

The Pen Test Automation platform supports executing security tools either locally via command line or through the Siege CQF platform via two command executors:

  • CliExecutor
  • CqfExecutor

The CliExecutor produces and executes a command on the command line and is suitable for a pen tester using the platform locally for manual pen testing. The CqfExecutor creates and executes a CQF experiment using the CQF REST API and is suitable for organizations looking to execute automated pen tests at an enterprise scale.

Execution from Command Line (CliExecutor)

The PTA platform uses the Apache Commons Exec library to execute tools locally. As a result, in order to run any tools that are wrapped for PTA, the tool must be installed on the system. Specifically, the tools must be available to be run from the user's path. Users should follow the installation and setup instructions for the given security tool and then ensure that the appropriate executables are included on the user's path. In Unix/Linux-based systems, a common strategy is to include a symlink to the tool executable in /usr/local/bin or similar directory on the user's PATH. In Windows systems, a common strategy is to edit the user's Environment Variables and include the directory containing the executable in the PATH variable.
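Before running a wrapped tool locally, it can help to confirm that its executable is actually reachable from the PATH. The sketch below is not part of the PTA platform and does not use Apache Commons Exec; it is a standalone check built on the standard library, and the class and method names are assumptions. It matches exact file names only, so on Windows the name would need to include the extension (e.g., .exe).

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

final class PathCheckSketch {

    /** Searches each directory of a PATH-style string for an executable with the given name. */
    static Optional<Path> findOnPath(String executable, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            Path candidate = Paths.get(dir, executable);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Tool names taken from the orchestration scripts mentioned in this README.
        for (String tool : new String[] {"esm-7", "crydra-16"}) {
            Optional<Path> location = findOnPath(tool, System.getenv("PATH"));
            System.out.println(tool + ": " + location.map(Path::toString).orElse("not found on PATH"));
        }
    }
}
```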

Execution from CQF (CqfExecutor)

The PTA platform uses Siege Technologies' CQF REST API to execute tools on a CQF server. The CQF platform virtualizes a target and runs "experiments" for testing. Experiments can be configured and executed using the REST API developed by Siege. The CQF platform uses design archetypes to create common architectures and design items for common components. Clients to the CQF platform (such as the PTA platform) can then leverage those design archetypes and items in order to create and execute experiments.

For the PTA platform's purposes, we intend to test one attack scenario per CQF experiment. This practice is simplest and ensures isolation and independence of each attack scenario execution (i.e., prevents one attack attempt from affecting the results of another attack attempt). Testing from a clean application is beneficial for application security testing because some attacks have unintended side effects (e.g., stored cross-site scripting attacks, SQL injection attacks) that may affect the results of other attacks.

The setup and configuration of CQF (including the design archetypes and items) is outside the scope of the PTA platform and this README. However, at a high level the following Design Archetypes and Design Items must be created:

  • Design Item(s) for Attack Tool(s): a design item must be created for each attack tool that could be invoked by the PTA; the tool must be onboarded to the CQF platform.
  • Design Item(s) for Application Component(s): a design item must be created for each machine (e.g., application server, database server) in the application.
  • Design Archetype for Application System: an archetype that represents the collection of design items required to simulate a testing environment of the target application.

Consult the corresponding CQF documentation for the process to create these design catalog items in the CQF platform. Once these design items have been set up and configured in CQF, any CQF REST client (such as the PTA platform) can refer to them when creating an experiment.

At a high level, the CQF Design Item paradigm is extremely flexible and can be set up in largely arbitrary ways. Thus, it was necessary to establish several conventions for how the PTA platform and the CQF platform would interact. For the initial integration example, we targeted a simple web application architecture consisting of an application server and a database. Specifically, we used an old vulnerable version of dotCMS, an open source content management system.

The Design Archetype for this dotCMS example consisted of a target "attackee" machine (i.e., the server running the dotCMS application), an "attackee database" (i.e., the server running the MySQL server supporting the dotCMS application), and an "attacker" machine (i.e., the server with the attack tools installed acting as the attacker). By convention, the identifier for this Design Archetype was designated as com.siegetechnologies.cqf.design.item.archetype.sql-injection. The name itself does not limit the experiment to SQL Injection tests; rather, it is an artifact of SQL Injection being the first chosen test case. Note: while this Design Archetype identifier is intended to be replaced with an appropriate convention, the value itself is currently hard-coded in the PTA platform in the CqfExecutor class.

The Design Items in turn are the aforementioned attackee, attackee database, and attacker machines. At a high level, these items will specify the base machine (e.g., Ubuntu) and software required along with some type of automated configuration. By convention, we designated that:

  • the attackee design item would be identified as com.siegetechnologies.cqf.design.item.software.dotCMS. Through configuration in CQF, the design item would be set up as an Ubuntu machine that runs an automated setup script created to install the old vulnerable version of dotCMS on the server. The CQF platform itself would automatically populate the IP address of the associated attackee database (since CQF orchestrates the virtual networking of the architecture) through a parameter expansion process that essentially injects the appropriate value into the configuration at runtime. Future iterations would allow the specification of the IP address of the attackee database to support scenarios where the application database had a known IP address (e.g., was a live running system outside of CQF).
  • the attackee database design item would be identified as com.siegetechnologies.cqf.design.item.database.mysql. Through configuration in CQF, the design item would be set up as an Ubuntu machine that installs MySQL and loads a predetermined schema. Future iterations would allow the specification of a schema as a parameter.
  • the attacker design item would be identified based on the name of the attack orchestration script wrapping the security tool, using the prefix "com.siegetechnologies.cqf.design.item.software." (note the trailing period). For example, we would use com.siegetechnologies.cqf.design.item.software.esm-7 to refer to the ESM-7 attack orchestration script and com.siegetechnologies.cqf.design.item.software.cyrdra-16 to refer to the Crydra-16 orchestration script. These design items are created in CQF to accept an input file as a Base64 encoded blob, and to specify the output file from which the results are retrieved through parameter expansion. This CQF setup paradigm is what led to the Orchestration Convention described earlier where security tools should accept an input file and generate an output file.

Note: As with the Design Archetype, the identifiers for the attackee and attackee database are currently hard-coded. Likewise, the prefix identifier for the attacker (i.e., "com.siegetechnologies.cqf.design.item.software.") is also hard-coded. The name of the tool is derived dynamically based on the chosen tool in the PTA platform (which in turn assumes that those corresponding attacker design items are created in CQF using the tool's name as the last part of the identifier).

With the CQF design catalog populated with the Design Archetype and Design Items, the PTA platform simply uses the CQF REST API to construct an experiment using the specified Design Archetype (i.e., com.siegetechnologies.cqf.design.item.archetype.sql-injection) with the correct Design Items (i.e., com.siegetechnologies.cqf.design.item.software.dotCMS, com.siegetechnologies.cqf.design.item.database.mysql and either com.siegetechnologies.cqf.design.item.software.esm-7 or com.siegetechnologies.cqf.design.item.software.cyrdra-16). Once the experiment is constructed on the CQF platform, the PTA platform can then execute the experiment and await the results. The PTA platform uses the Java API client provided by CQF to invoke the REST API. This Java client generates a JSON object for the defined experiment and sends the JSON object via HTTP request to a CQF REST API endpoint (which the user is prompted for in the PTA platform).

An example JSON object generated by the PTA platform for an attack using ESM-7 is shown below:

{
  "design": {
    "objectKey": "com.siegetechnologies.cqf.design.item.archetype.sql-injection"
  },
  "parameter_bindings": [],
  "children": [
    {
      "object": {
        "design": {
          "objectKey": "com.siegetechnologies.cqf.design.item.software.esm-7"
        },
        "parameter_bindings": [
          {
            "name": "attack_config_file_as_blob",
            "value": "base64Encoded(attack.conf.file)"
          }
        ],
        "children": []
      }
    },
    {
      "object": {
        "objectKey": "com.siegetechnologies.cqf.design.item.software.dotCMS"
      }
    },
    {
      "object": {
        "objectKey": "com.siegetechnologies.cqf.design.item.software.mysql"
      }
    }
  ]
}
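The attacker child object in the JSON above combines two conventions: the design item identifier is the hard-coded prefix plus the tool name, and the configuration file travels as a Base64 encoded blob. The sketch below shows both steps using only the standard library. It is an illustration, not the CQF Java client: the class and method names are assumptions, and the JSON is assembled by hand on the assumption that the tool name contains no characters needing JSON escaping.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class ExperimentBindingSketch {

    /** Prefix quoted in this README for attacker design items. */
    static final String ATTACKER_PREFIX = "com.siegetechnologies.cqf.design.item.software.";

    /** Derives the attacker design item identifier from the tool name. */
    static String attackerDesignItem(String toolName) {
        return ATTACKER_PREFIX + toolName;
    }

    /** Encodes the configuration file contents as a Base64 blob. */
    static String encodeConfig(String configFileContents) {
        return Base64.getEncoder()
                .encodeToString(configFileContents.getBytes(StandardCharsets.UTF_8));
    }

    /** Builds the attacker child object by hand (the real platform uses the CQF client). */
    static String attackerChildJson(String toolName, String configFileContents) {
        return "{\"object\":{"
                + "\"design\":{\"objectKey\":\"" + attackerDesignItem(toolName) + "\"},"
                + "\"parameter_bindings\":[{"
                + "\"name\":\"attack_config_file_as_blob\","
                + "\"value\":\"" + encodeConfig(configFileContents) + "\"}],"
                + "\"children\":[]}}";
    }

    public static void main(String[] args) {
        // "target.url" is a hypothetical configuration key used only for this demo.
        System.out.println(attackerChildJson("esm-7", "target.url=http://example.test/\n"));
    }
}
```

The standard Base64 alphabet contains no characters that require JSON escaping, so the encoded blob can be embedded in the string value directly.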

The CQF platform executes the attack and returns a result. This result needs to be interpreted by the PTA platform. Future iterations would determine the output and results generated by CQF. In theory, CQF provides similar output to execution from the command line: a CQF result would contain standard out and standard error streams as well as exit codes and generated files from any processes that ran on machines in the experiment. The PTA platform would simply need to parse through these results and identify output belonging to the attacker machine. Once those results were distilled, processing would proceed much as it does for the CliExecutor.

Putting It All Together

The tta3-orchestrator project represents a concrete implementation of the Pen Test Automation platform. The implementation loads ToolWrapper plugins (via the Java ServiceLoader SPI), loads application metadata (mocked), and allows the user to select a supported attack pattern. Based on the selected attack pattern, an appropriate tool would be chosen based on a heuristic that takes into account the attack and application metadata (mocked). The user is given the ability to derive required tool parameters based on (mocked) application metadata or manually enter these values.

The PTA platform then generates a ToolCommand that can be used by a PTA command executor. The user is given a choice to execute the generated ToolCommand in CQF (using the CqfExecutor), or on the command line interface (using the CliExecutor). The PTA platform then previews the command execution by either displaying the experiment JSON that would be generated (if CqfExecutor is chosen) or the command that would be run on CLI (if CliExecutor is chosen).

The user can then execute the command. If the user selected the CqfExecutor, the user is prompted for the CQF REST API endpoint where the generated experiment JSON is sent for execution. If the user selected the CliExecutor, the command is executed locally. Once the execution is complete, the raw results of the command are shown and the user is given the opportunity to "process" the results. After processing the result, the user is shown a distilled version of the "finding" if any. See the Getting Started guide for more detail.
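The tool selection step can be pictured with a deliberately simple sketch. The actual heuristic is not documented here and also weighs application metadata; the version below is an assumption that just returns the first registered plugin advertising the chosen CAPEC identifier. The plugin names and supported attacks come from the Examples section, while the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class ToolSelectionSketch {

    /** Supported attacks per reference plugin, as listed in the Examples section. */
    static Map<String, Set<String>> referencePlugins() {
        Map<String, Set<String>> plugins = new LinkedHashMap<>(); // keeps registration order
        plugins.put("esm7-tool-spi", new HashSet<>(Arrays.asList("CAPEC-7", "CAPEC-66")));
        plugins.put("crydra16-tool-spi", new HashSet<>(Arrays.asList("CAPEC-16")));
        plugins.put("dummy-tool-spi", new HashSet<>(Arrays.asList("CAPEC-0")));
        return plugins;
    }

    /** Returns the first plugin (in registration order) that advertises the given attack. */
    static Optional<String> chooseTool(Map<String, Set<String>> supportedAttacksByTool, String capecId) {
        return supportedAttacksByTool.entrySet().stream()
                .filter(entry -> entry.getValue().contains(capecId))
                .map(Map.Entry::getKey)
                .findFirst();
    }

    public static void main(String[] args) {
        for (String attack : new String[] {"CAPEC-66", "CAPEC-16", "CAPEC-999"}) {
            System.out.println(attack + " -> "
                    + chooseTool(referencePlugins(), attack).orElse("no supporting tool"));
        }
    }
}
```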

Quick Demo

Prerequisites

  1. Java JDK 8 or later
  2. Maven
  3. A properly configured instance of the CQF platform (note the discussion above in Execution from CQF)
  4. The proper local setup of the ESM-7 and/or Crydra-16 orchestration scripts and their dependencies (note the path requirements specified in the section above in Execution from Command Line)
  5. The compiled CQF REST API Java Client JAR available to your local Maven

Instructions

  1. cd [path-to-downloaded-code]/
  2. mvn clean install
  3. cd tta3-orchestrator/
  4. mvn exec:java -P cli-with-plugin

See the Getting Started guide for more detail.