
XsdAsmFaster is an alternative to XsdAsm that skips the use of a data structure to store Element information in order to improve performance.

Primary language: Java. License: MIT.


XsdAsmFaster

XsdAsmFaster is a library dedicated to generating a fluent Java DSL based on an XSD file. It uses the XsdParser library to parse the XSD file into a list of Java classes, from which XsdAsmFaster obtains the information needed to generate the corresponding classes. To generate classes this library uses ASM, a library that provides a Java interface for bytecode manipulation, which allows the creation of classes, methods, etc.

This library's main objective is to generate a fluent Java DSL based on an existing XSD DSL. It aims to verify as many of the restrictions defined in the XSD DSL as possible. It relies on the Java compiler to perform most validations and, where that isn't possible, it performs run time validations, throwing exceptions if the rules of the language are violated.

This library is based on the previously created XsdAsm, with the intent of improving the performance of the generated DSLs.

Installation

To include it in your Maven project, simply add this dependency:

<dependency>
    <groupId>com.github.xmlet</groupId>
    <artifactId>xsdAsmFaster</artifactId>
    <version>1.0.10</version>
</dependency>

How does XsdAsmFaster work?

The XSD language uses two main types of values: elements and attributes. Elements are complex value types: they can have attributes and contain other elements. Attributes are defined by a type and a value, which can have restrictions. With that in mind, XsdAsmFaster defines a common set of classes that supports every generated DSL.


The Element interface serves as a base for all element classes generated in any DSL, such as the Html class used as an example. There are multiple differences between this version and XsdAsm. Regarding infrastructure, the most notable one is the removal of AbstractElement, Attribute and BaseAttribute for performance reasons. Even attribute classes such as AttrManifest are only created if they have some kind of restriction associated with them, other than enumerations, which are explained further ahead.

Concrete Usage

XsdAsmFaster provides an XsdAsmMain class that receives two arguments: the first is the XSD file path and the second is the name of the DSL to be generated. All the generated DSLs are placed under the same base package, org.xmlet, and differ only in the chosen DSL name. For example, if the DSL name is htmlapifaster, the resulting package name is org.xmlet.htmlapifaster.

public class Example{
    void generateApi(String filePath, String apiName){
        XsdAsmMain.main(new String[] {filePath, apiName} );    
    }
}
The generated classes are written to the target folder of the invoking project. For example, the HtmlApiFaster project invokes XsdAsmMain, generating all the HtmlApiFaster classes and writing them to the HtmlApiFaster target folder. This way, when HtmlApiFaster is used as a dependency, those classes appear as normal classes, as if they had been written manually.

Examples

Using the Html element from the HTML5 specification a simple example will be presented, which can be extrapolated to other elements. Some simplification will be made in this example for easier understanding.

<xs:element name="html">
    <xs:complexType>
        <xs:choice>
            <xs:element ref="body"/>
            <xs:element ref="head"/>
        </xs:choice>
        <xs:attributeGroup ref="commonAttributeGroup" />
        <xs:attribute name="manifest" type="xsd:anyURI" />
    </xs:complexType>
</xs:element>
With this example in mind what classes will need to be generated?

Html Class - A class that represents the Html element, represented in XSD by the xs:element name="html". It implements the Element interface through CommonAttributeGroup.
body and head Methods - Two methods present in the Html class that create Body and Head instances as children of Html. These methods are created due to the presence of both elements in the xs:choice XSD element.
attrManifest Method - A method present in the Html class that passes the manifest value to the visitor through visitAttributeManifest. This method is created because the XSD html element contains an xs:attribute name="manifest" with the xsd:anyURI type, which maps to String in Java.

// Z is the type of the parent element.
public class Html<Z extends Element> implements CommonAttributeGroup {
    protected final Z parent;
    protected final ElementVisitor visitor;

    // Root element: receives the visitor and has no parent.
    public Html(ElementVisitor visitor) {
        this.visitor = visitor;
        this.parent = null;
        visitor.visitElementHtml(this);
    }

    // Child element: reuses the visitor of its parent.
    public Html(Z parent) {
        this.parent = parent;
        this.visitor = parent.getVisitor();
        this.visitor.visitElementHtml(this);
    }

    public final Html<Z> attrManifest(String attrManifest) {
        this.visitor.visitAttributeManifest(attrManifest);
        return this;
    }

    public Body<Html<Z>> body() {
        return new Body<>(this);
    }

    public Head<Html<Z>> head() {
        return new Head<>(this);
    }
}
Body and Head classes - Classes for both Body and Head elements, created based on their respective XSD xsd:element.

public class Body<Z extends Element> implements Element {
    // Contents based on the respective xsd:element name="body"
}
public class Head<Z extends Element> implements Element {
    // Contents based on the respective xsd:element name="head"
}
CommonAttributeGroup Interface - An interface with default methods that add the group attributes to the element which implements this interface.

public interface CommonAttributeGroup extends Element {

   // Assuming CommonAttributeGroup is an attribute group with a single
   // attribute named SomeAttribute with the type String.
   // Simplified: the method returns the implementing element, here Html.
   default Html attrSomeAttribute(String attributeValue) {
      this.getVisitor().visitAttributeSomeAttribute(attributeValue);
      return (Html) this;
   }
}

Type Arguments

As stated previously, the DSLs generated by this project aim to guarantee the validation of the set of rules associated with the language. To achieve this we rely heavily on Java types, as shown above: the Html class can only contain Body and Head instances as children, and only attributes such as manifest or those belonging to CommonAttributeGroup. This solves our problem, but since the generated DSLs are fluent, another important aspect is to always maintain type information. To guarantee this we use type parameters, also known as generics.

class Example{
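    void example(ElementVisitor visitor){
        // The root element receives the visitor, as in the Html constructors above.
        Html<Element> html = new Html<>(visitor);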
        Body<Html<Element>> body = html.body();
        
        P<Header<Body<Html<Element>>>> p1 = body.header().p();
        P<Div<Body<Html<Element>>>> p2 = body.div().p();
        
        Header<Body<Html<Element>>> header = p1.__();
        Div<Body<Html<Element>>> div = p2.__();
    }        
}
In this example we can see how the type information is maintained. When each element is created it receives the type of its parent, which allows us to keep the type information even when we navigate back to the parent of the current element. A good example of this are the two P element instances, p1 and p2. Both are P elements, but each one has different parent information: p1 is a child of a Header instance, while p2 is a child of a Div instance. When the method that navigates to the parent element is called, the __() method, each one returns its respective parent with the correct type.
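The mechanism can be reproduced in isolation. The following is a minimal sketch, not the generated code: Child, Root and TypeArgumentsSketch are hypothetical names, and the visitor calls are left out so that only the parent type parameter remains.

```java
// Minimal stand-ins for generated classes (hypothetical, visitor calls omitted).
interface Element { }

// Z is the type of the parent, so __() can return it without a cast.
abstract class Child<Z extends Element> implements Element {
    private final Z parent;

    Child(Z parent) { this.parent = parent; }

    public Z __() { return parent; }
}

// Hypothetical root type, used only to have something to hang Body on.
final class Root implements Element { }

final class Body<Z extends Element> extends Child<Z> {
    Body(Z parent) { super(parent); }

    public Div<Body<Z>> div() { return new Div<>(this); }

    public Header<Body<Z>> header() { return new Header<>(this); }
}

final class Div<Z extends Element> extends Child<Z> {
    Div(Z parent) { super(parent); }

    public P<Div<Z>> p() { return new P<>(this); }
}

final class Header<Z extends Element> extends Child<Z> {
    Header(Z parent) { super(parent); }

    public P<Header<Z>> p() { return new P<>(this); }
}

final class P<Z extends Element> extends Child<Z> {
    P(Z parent) { super(parent); }
}

class TypeArgumentsSketch {
    public static void main(String[] args) {
        Body<Root> body = new Body<>(new Root());
        Div<Body<Root>> div = body.div();

        // The compiler knows that the parent of this P is a Div, not a Header.
        Div<Body<Root>> sameDiv = div.p().__();
        Header<Body<Root>> header = body.header().p().__();

        // Header<Body<Root>> wrong = div.p().__(); // would not compile

        System.out.println(sameDiv == div); // prints "true"
    }
}
```

Because the parent type travels in the type argument, an attempt to treat the parent of p2 as a Header is rejected by the compiler rather than failing at run time.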

Restriction Validation

Any given XSD file contains many restrictions on how elements can be contained in each other and on which attributes are allowed. When reflecting those restrictions in the Java language we have two ways of ensuring them: at run time or at compile time. This library tries to validate most of the restrictions at compile time, as shown in the example above. Some restrictions, however, cannot be validated at compile time. An example is the following restriction:

<xs:schema>
    <xs:element name="testElement">
        <xs:complexType>
            <xs:attribute name="intList" type="valuelist"/>
        </xs:complexType>
    </xs:element>
    
    <xs:simpleType name="valuelist">
        <xs:restriction>
            <xs:maxLength value="5"/>
            <xs:minLength value="1"/>
        </xs:restriction>
        <xs:list itemType="xsd:int"/>
    </xs:simpleType>
</xs:schema>
In this example we have an element with an attribute called intList, whose type is valuelist. This attribute has some restrictions: it is represented by an xsd:list and its element count must be between 1 and 5. Transporting this example to the Java language results in the following class:

public class AttrIntList extends BaseAttribute<List<Integer>> {
   public AttrIntList(List<Integer> attrValue) {
      super(attrValue, "intList");
   }
}
With this solution, however, the xsd:maxLength and xsd:minLength restrictions are ignored. To solve this problem, the existing restrictions of any given attribute are hardcoded in the class constructor. This results in calls to validation methods, which verify the attribute restrictions whenever an instance is created. If the instance fails any validation, the validation method throws an exception.
public class AttrIntList extends BaseAttribute<List<Integer>> {
   public AttrIntList(List<Integer> attrValue) {
      super(attrValue, "intList");
      RestrictionValidator.validateMaxLength(5, attrValue);
      RestrictionValidator.validateMinLength(1, attrValue);
   }
}
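To make the run time check concrete, here is a minimal sketch of what such validation methods could look like. RestrictionValidatorSketch and AttrIntListSketch are hypothetical stand-ins written for this example, and the use of IllegalArgumentException is an assumption; the exception type thrown by the library is not shown in this document.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the validator called by the generated attribute classes.
final class RestrictionValidatorSketch {
    private RestrictionValidatorSketch() { }

    // Rejects lists with more than maxLength items.
    static void validateMaxLength(int maxLength, List<?> values) {
        if (values.size() > maxLength) {
            throw new IllegalArgumentException(
                "List has " + values.size() + " items, at most " + maxLength + " allowed");
        }
    }

    // Rejects lists with fewer than minLength items.
    static void validateMinLength(int minLength, List<?> values) {
        if (values.size() < minLength) {
            throw new IllegalArgumentException(
                "List has " + values.size() + " items, at least " + minLength + " required");
        }
    }
}

// Simplified attribute class: the restrictions are checked on construction.
final class AttrIntListSketch {
    private final List<Integer> value;

    AttrIntListSketch(List<Integer> value) {
        RestrictionValidatorSketch.validateMaxLength(5, value);
        RestrictionValidatorSketch.validateMinLength(1, value);
        this.value = value;
    }

    List<Integer> getValue() { return value; }
}

class RestrictionSketchDemo {
    public static void main(String[] args) {
        new AttrIntListSketch(Arrays.asList(1, 2, 3)); // accepted: between 1 and 5 items

        try {
            new AttrIntListSketch(Arrays.asList(1, 2, 3, 4, 5, 6)); // six items
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The values 5 and 1 come straight from the xs:maxLength and xs:minLength facets of the schema, which is why they can be hardcoded when the class is generated.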

Enumerations

Among the restrictions there is a special one that can be enforced at compile time: xsd:enumeration. To obtain that validation at compile time, the library generates Enum classes that contain all the values indicated in the xsd:enumeration tags. In the following example we have an attribute with three possible values: command, checkbox and radio.

<xs:attribute name="type">
    <xs:simpleType>
        <xs:restriction base="xsd:string">
            <xs:enumeration value="command" />
            <xs:enumeration value="checkbox" />
            <xs:enumeration value="radio" />
        </xs:restriction>
    </xs:simpleType>
</xs:attribute>
This results in the creation of an Enum, EnumTypeCommand, as shown below. Any attribute that uses this type receives an instance of EnumTypeCommand instead of a String. This guarantees at compile time that only values from the allowed set are passed to the respective attribute.

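public enum EnumTypeCommand {
   COMMAND("command"),
   CHECKBOX("checkbox"),
   RADIO("radio");

   // The literal value defined in the respective xsd:enumeration tag.
   private final String value;

   EnumTypeCommand(String value) {
      this.value = value;
   }

   public String getValue() {
      return value;
   }
}
public class AttrTypeEnumTypeCommand extends BaseAttribute<String> {
   public AttrTypeEnumTypeCommand(EnumTypeCommand attrValue) {
      super(attrValue.getValue(), "type");
   }
}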
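From the caller's side the effect is that an invalid value cannot even be written. The sketch below illustrates this with hypothetical, simplified classes (EnumTypeCommandSketch and CommandSketch are not generated names, and the StringBuilder output stands in for the visitor calls).

```java
// Hypothetical, simplified versions of a generated enum and element.
enum EnumTypeCommandSketch {
    COMMAND("command"),
    CHECKBOX("checkbox"),
    RADIO("radio");

    private final String value;

    EnumTypeCommandSketch(String value) { this.value = value; }

    String getValue() { return value; }
}

final class CommandSketch {
    private final StringBuilder out = new StringBuilder("<command");

    // Only the three enum constants can be passed; any other value fails to compile.
    CommandSketch attrType(EnumTypeCommandSketch type) {
        out.append(" type=\"").append(type.getValue()).append('"');
        return this;
    }

    String render() { return out + ">"; }
}

class EnumSketchDemo {
    public static void main(String[] args) {
        String markup = new CommandSketch()
            .attrType(EnumTypeCommandSketch.RADIO)
            .render();

        System.out.println(markup); // prints <command type="radio">

        // new CommandSketch().attrType("button"); // would not compile
    }
}
```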

Visitor

This library also uses the Visitor pattern. This pattern allows different uses for the same DSL, as long as different Visitors are implemented. Each generated DSL has one ElementVisitor, an abstract class which contains five main visit methods:

  • visitElement(Element element) - This method is called whenever a class generated from an XSD xsd:element is instantiated, as the constructors of the Html class show. It receives the element that was created.
  • visitAttribute(String attributeName, String attributeValue) - This method is called when an attribute method is called. It receives the attribute name and the attribute value.
  • visitParent(Element element) - This method is called when the __() method is invoked, receiving the instance where the method was invoked.
  • visitText(Text text) - This method is called when the text method is invoked.
  • visitComment(Text comment) - This method is called when the comment method is invoked.


Apart from these five methods, specific methods are created for each generated element class, e.g. the Html class, as we can see below with the methods visitParentHtml and visitElementHtml. The same strategy is applied to attributes, as shown for the manifest attribute with the method visitAttributeManifest. In addition we define two more methods: visitOpenDynamic and visitCloseDynamic. These methods serve a simple purpose: they indicate the start and end of the dynamic parts of the generated element tree. A Visitor implementation can take advantage of this information to create a caching strategy and improve performance even further.

public abstract class ElementVisitor {
    public abstract void visitElement(Element element);
    
    public abstract void visitAttribute(String attributeName, String attributeValue);
    
    public abstract void visitParent(Element element);
    
    public abstract <R> void visitText(Text<? extends Element, R> text);
    
    public abstract <R> void visitComment(Text<? extends Element, R> comment);
    
    public void visitOpenDynamic() { }
    
    public void visitCloseDynamic() { }
    
    public void visitParentHtml(Html element) {
        this.visitParent(element);
    }
    
    public void visitElementHtml(Html element) {
        this.visitElement(element);
    }
    
    public void visitAttributeManifest(String manifest) {
        this.visitAttribute("manifest", manifest);
    }
}
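As an illustration of how a concrete Visitor can use these callbacks, the following sketch writes markup to a StringBuilder as the calls arrive. It is self-contained and deliberately simplified: ElementVisitorSketch, HtmlWriterVisitor and Tag are hypothetical names, Element is reduced to a getName() method, and visitText takes a plain String instead of a Text instance.

```java
// Minimal stand-ins, assumed for illustration only.
interface Element {
    String getName();
}

abstract class ElementVisitorSketch {
    abstract void visitElement(Element element);
    abstract void visitAttribute(String attributeName, String attributeValue);
    abstract void visitParent(Element element);
    abstract void visitText(String text);
}

// One possible visitor: appends the markup as soon as each call arrives.
final class HtmlWriterVisitor extends ElementVisitorSketch {
    private final StringBuilder out = new StringBuilder();
    private boolean tagOpen = false; // true while attributes may still be added

    private void closeOpenTag() {
        if (tagOpen) {
            out.append('>');
            tagOpen = false;
        }
    }

    @Override
    void visitElement(Element element) {
        closeOpenTag();
        out.append('<').append(element.getName());
        tagOpen = true;
    }

    @Override
    void visitAttribute(String attributeName, String attributeValue) {
        out.append(' ').append(attributeName).append("=\"").append(attributeValue).append('"');
    }

    @Override
    void visitParent(Element element) {
        closeOpenTag();
        out.append("</").append(element.getName()).append('>');
    }

    @Override
    void visitText(String text) {
        closeOpenTag();
        out.append(text);
    }

    String result() { return out.toString(); }
}

final class Tag implements Element {
    private final String name;

    Tag(String name) { this.name = name; }

    public String getName() { return name; }
}

class VisitorSketchDemo {
    public static void main(String[] args) {
        HtmlWriterVisitor visitor = new HtmlWriterVisitor();
        Tag html = new Tag("html");
        Tag body = new Tag("body");

        // The sequence of calls a fluent chain such as
        // html.attrManifest(..).body().text(..).__().__() would trigger.
        visitor.visitElement(html);
        visitor.visitAttribute("manifest", "app.manifest");
        visitor.visitElement(body);
        visitor.visitText("Hello");
        visitor.visitParent(body);
        visitor.visitParent(html);

        // prints <html manifest="app.manifest"><body>Hello</body></html>
        System.out.println(visitor.result());
    }
}
```

A different Visitor, for instance one that counts elements or validates nesting, could be plugged into the same sequence of calls without changing the DSL.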

Performance

To improve performance over the previously created XsdAsm, we changed the strategy used by the generated DSLs. XsdAsm used a strategy based on two distinct steps:
  • Create the element tree.
  • Visit the element tree to perform the concrete behaviour defined in the Visitor implementation.
Although this approach worked as intended, it was wasteful, because there was no need to store information and then iterate over it at a later moment. With this in mind, this library merged the two steps, which means that the visit methods are called as the element tree is being created. We can see how this works with the Html class defined above, replicated below.
// Z is the type of the parent element.
public class Html<Z extends Element> implements CommonAttributeGroup {
    protected final Z parent;
    protected final ElementVisitor visitor;

    // Root element: receives the visitor and has no parent.
    public Html(ElementVisitor visitor) {
        this.visitor = visitor;
        this.parent = null;
        visitor.visitElementHtml(this);
    }

    // Child element: reuses the visitor of its parent.
    public Html(Z parent) {
        this.parent = parent;
        this.visitor = parent.getVisitor();
        this.visitor.visitElementHtml(this);
    }

    public final Html<Z> attrManifest(String attrManifest) {
        this.visitor.visitAttributeManifest(attrManifest);
        return this;
    }

    public Body<Html<Z>> body() {
        return new Body<>(this);
    }

    public Head<Html<Z>> head() {
        return new Head<>(this);
    }
}
As we can see in this class, the visit methods are called in the class constructors. The same happens with other classes such as Body and Head when their respective methods are called, i.e. body and head, and with attributes, as shown by the attrManifest method.

This approach greatly improved performance, because it removed all the overhead of storing data in a data structure and reading it back.
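The difference between the two strategies can be sketched side by side. The classes below are hypothetical and much simpler than the generated code: StoredNode keeps its children in a list and is walked afterwards, while StreamingNode emits its output in the constructor and in __() and stores no children at all.

```java
import java.util.ArrayList;
import java.util.List;

// Two-step strategy (XsdAsm style, simplified): store children, then walk them.
final class StoredNode {
    final String name;
    final List<StoredNode> children = new ArrayList<>();

    StoredNode(String name) { this.name = name; }

    StoredNode child(String childName) {
        StoredNode child = new StoredNode(childName);
        children.add(child); // extra allocation and bookkeeping per element
        return child;
    }

    // Second step: iterate over the stored tree.
    void accept(StringBuilder out) {
        out.append('<').append(name).append('>');
        for (StoredNode child : children) {
            child.accept(out);
        }
        out.append("</").append(name).append('>');
    }
}

// Merged strategy (XsdAsmFaster style, simplified): emit while building, store nothing.
final class StreamingNode {
    private final String name;
    private final StreamingNode parent;
    private final StringBuilder out;

    StreamingNode(String name, StreamingNode parent, StringBuilder out) {
        this.name = name;
        this.parent = parent;
        this.out = out;
        out.append('<').append(name).append('>'); // the "visit" happens in the constructor
    }

    StreamingNode child(String childName) {
        return new StreamingNode(childName, this, out);
    }

    // Closes this element and navigates back to the parent.
    StreamingNode __() {
        out.append("</").append(name).append('>');
        return parent;
    }
}

class StrategySketchDemo {
    public static void main(String[] args) {
        StoredNode root = new StoredNode("html");
        root.child("body");
        StringBuilder twoStep = new StringBuilder();
        root.accept(twoStep);

        StringBuilder merged = new StringBuilder();
        new StreamingNode("html", null, merged).child("body").__().__();

        // Both strategies produce the same markup.
        System.out.println(twoStep.toString().equals(merged.toString())); // prints "true"
    }
}
```

Both variants produce the same result, but the merged one never allocates a child list or traverses it, which is the overhead the paragraph above refers to.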

Element Binding

Element binding is also very different from XsdAsm. In this library there is no binding process, since every time we create a tree we are effectively creating a new tree, rather than reusing an already defined tree with different information as was done in XsdAsm. Let's take a look at the example of element binding provided in XsdAsm:

public class XsdAsmBinderExample{
    public void bindExample(){
        Html<Element> root = new Html<>()
            .body()
                .table()
                    .tr()
                        .th()
                            .text("Title")
                        .__()
                    .__()
                    .<List<String>>binder((elem, list) ->
                        list.forEach(tdValue ->
                            elem.tr().td().text(tdValue)
                        )
                    )
                .__()
            .__()
        .__();
    }
 }
In this example a Table instance is created and a first row is added containing a table header, i.e. th, with the text Title. After defining the table header we invoke the binder method. This method binds the Table instance to a function, which defines the behaviour to be performed when this instance receives the information. In XsdAsmFaster this template would be defined in the following way:

import java.util.Arrays;
import java.util.List;

class XsdAsmFasterBinding{
    public void exampleMethod(){
        CustomVisitor visitor = new CustomVisitor();
        List<String> tdValues = Arrays.asList("val1", "val2", "val3");

        new Html<>(visitor)
            .body()
                .table()
                    .tr()
                        .th()
                            .text("Title")
                        .__()
                    .__()
                    .of(table ->
                        tdValues.forEach(value ->
                            table
                                .tr()
                                    .td()
                                        .text(value)
                                    .__()
                                .__()
                        )
                    )
                .__()
            .__()
        .__();
    }
}
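The of method used above can be understood as "run this function against the current element and keep chaining". The sketch below shows one way such a method could be written. TableSketch is a hypothetical class, row is a shorthand for tr().td().text(value).__().__(), and the signature of of is an assumption based on how it is called in the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, simplified table element that writes directly to a StringBuilder.
final class TableSketch {
    private final StringBuilder out;

    TableSketch(StringBuilder out) {
        this.out = out;
        out.append("<table>");
    }

    // Shorthand for tr().td().text(value).__().__() in the real DSL.
    TableSketch row(String value) {
        out.append("<tr><td>").append(value).append("</td></tr>");
        return this;
    }

    // Assumed shape of "of": apply the consumer to this element, then continue the chain.
    TableSketch of(Consumer<TableSketch> consumer) {
        consumer.accept(this);
        return this;
    }

    String end() {
        return out.append("</table>").toString();
    }
}

class OfSketchDemo {
    public static void main(String[] args) {
        List<String> tdValues = Arrays.asList("val1", "val2");

        String markup = new TableSketch(new StringBuilder())
            .row("Title")
            .of(table -> tdValues.forEach(table::row)) // dynamic part, evaluated immediately
            .end();

        // prints <table><tr><td>Title</td></tr><tr><td>val1</td></tr><tr><td>val2</td></tr></table>
        System.out.println(markup);
    }
}
```

Since the data is captured by the lambda and consumed while the tree is being built, no separate binding step is needed: producing output for different data means running the template method again with that data.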

Code Quality

There are some tests available using the HTML5 schema and the Android layouts schema. You can take a look at those examples and tweak them in order to gain a better understanding of how the class generation works. The tests also cover most of the code. If you are interested in verifying the code quality, vulnerabilities and other metrics, check the following link:

Sonarcloud Statistics

Final remarks

Some examples presented here are simplified in order to give a better understanding of how this library works.

Changelog

1.0.8

  • Details - Adds CustomElements for HtmlApiFaster/HtmlFlow.

1.0.7

1.0.6

  • Added two new methods to ElementVisitor, visitOpenAsync and visitCloseAsync, to allow asynchronous operations.
  • Added an async method to all Elements.

1.0.5

  • First usable version.