XSD2HTML2XML

Generates plain HTML5 forms from XML schemas (XSDs). Transforms filled-in forms into XML.

XML schemas contain a wealth of information about what data is allowed in an XML structure, and thus how a user interface should be presented. HTML5 supports many new input types and attributes that are compatible with XML schemas. XSD2HTML2XML automates the process of generating forms from XML schemas and extracting valid XML from them after users fill them out. This makes it much easier for users to enter well-formed, valid XML.

In a nutshell:

  • Generates a plain HTML5 form from any XML schema (XSD);
  • Generates schema-conformant XML from filled-out forms;
  • Supports populating the generated form with data from an XML document;
  • Supports namespaces (including combining schemas through xs:include and xs:import tags);
  • Written in fast and widely supported XSLT 1.0 (which means it can run client-side in browsers);
  • Has no dependencies;
  • Generates pure HTML5 forms with vanilla JavaScript for interactivity;
  • Is easily stylable with CSS, or extendable with any library or framework;
  • Is free for any purpose (MIT).

Versions

Source Code

It is recommended to always use the latest release, as the latest commits may contain experimental or untested features.

  • Version 3: a modular rewrite that is much easier to maintain, debug, and implement;
  • Version 2 (deprecated): first version with namespaces support;
  • Version 1 (deprecated): original release.

Software

Features

Supported XSD Structures & Datatypes

  • Simple and complex elements, attributes, exclusions, restrictions, groups, etc. The full list of supported XSD tags is as follows: all, attribute, attributeGroup, choice, complexContent, complexType, element, extension, import, include, group, restriction, schema, sequence, simpleContent, simpleType, union (partially).
  • minOccurs and maxOccurs, including vanilla JavaScript snippets that handle inserting and deleting elements.
  • Default and fixed values, required and optional attributes.
  • All restrictions that can be supported by HTML5: enumeration, length, maxExclusive, maxInclusive, maxLength, minExclusive, minInclusive, pattern, totalDigits, fractionDigits, and whiteSpace.
  • Practically all data types that can be supported by HTML5: string, normalizedString, token, language, byte, decimal, int, integer, long, short (including their positive, negative, and unsigned variants), date, time, dateTime, month, gDay, gMonth, gYearMonth, gYear, gMonthDay, hexBinary, base64Binary, anyURI, double, float, boolean. Note that all other data types are rendered as input[type=text] boxes, which still makes them editable in most cases.
  • Namespaces: XSD files can reference other XSDs through include and import tags. Working with those is supported from version 2 onwards.
  • Custom (multi-language) labels for elements, using the xs:annotation/xs:documentation tags nested directly inside each element.

Limitations on XSDs

  • any and anyAttribute are ignored.
  • Mixed content (i.e. elements that can contain plain content and elements intermittently, designated by mixed="true") is not supported.
  • Restrictions on union elements are not supported, because they can contain content originating from different base types.
  • A type reference to another XSD must be accessible, or an element will not be generated. So, if you declare an element with a type in a different file, make sure there's an import or include tag that points to the corresponding XSD file. Multiple XSD references in an xsi:schemaLocation attribute are not supported.
  • Namespaces loaded from external documents must have a declared prefix in the original XSD.
  • elementFormDefault and form are ignored. All elements are assumed to be in the namespaces indicated by their hierarchical position in the document (i.e. elementFormDefault="qualified" is assumed). attributeFormDefault is supported.
  • Recursion in named types is not supported: if complexType A allows for an element of complexType B, which allows for an element of complexType A, an infinite loop is created.
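
To illustrate the two points above about cross-file type references and declared prefixes, here is a minimal sketch of a schema that satisfies both. The file names (main.xsd, other.xsd), the namespace URIs, the prefix o, and the element and type names are all hypothetical and only serve as an example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- main.xsd (hypothetical file name) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:o="http://example.com/other"
           targetNamespace="http://example.com/main"
           elementFormDefault="qualified">

	<!-- The import points at the file that declares o:AddressType,
	     so the type reference below can be resolved. -->
	<xs:import namespace="http://example.com/other" schemaLocation="other.xsd"/>

	<!-- The external namespace is used through its declared prefix (o). -->
	<xs:element name="address" type="o:AddressType"/>
</xs:schema>
```

Without the xs:import line, the address element would presumably be skipped, since its type could not be found.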

Implementation

Be sure to download the latest release; you need all files included in the ZIP, except for those inside /deprecated and /examples.

From there it's very straightforward: transform your XSD or XML file with xsd2html2xml.xsl, and an HTML5 form is generated. You can pass either an XSD file to xsd2html2xml.xsl, or an XML file which references an XSD file through xsi:schemaLocation or xsi:noNamespaceSchemaLocation. In the latter case the content from the XML document is used to populate the generated form.
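
As a sketch of the second option, an XML document that references its schema could look like the one below. The file names order.xml and order.xsd and the element names are hypothetical; the xsi attributes are the standard XML Schema instance attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- order.xml (hypothetical): references order.xsd, a schema without a target namespace -->
<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:noNamespaceSchemaLocation="order.xsd">
	<customer>Jane Doe</customer>
	<quantity>3</quantity>
</order>
```

For a schema with a target namespace, xsi:schemaLocation takes a namespace URI followed by the file location instead, for example xsi:schemaLocation="http://example.com/order order.xsd".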

XSLT Processor Support

All XSLT processors listed below are (partially) supported. To load documents on the fly, an XSLT extension with a nodeset function is required. For the processors listed below, nodeset-xxx.xsl files are provided. Be sure to include the correct nodeset-xxx.xsl file in xsd2html2xml.xsl depending on your implementation.

| XSLT Processor | Nodeset File | Support | Comments |
| libxslt (WebKit browsers) | nodeset-exslt.xsl | Partial | Transformations must always be started from the root XSD file, even when using an XML document referencing its schema. To populate forms in this scenario, add the XML file to the data-xsd2html2xml-source attribute of the meta[name='generator'] HTML element after the form has been generated. |
| MSXML / MSXSL (IE, Edge) | nodeset-msxsl.xsl | Full | |
| XslCompiledTransform (.NET) | nodeset-exslt.xsl | Full | |
| Saxon | nodeset-xslt2plus.xsl | Full | |
| Transformiix (Firefox) | nodeset-exslt.xsl | Partial | Namespaces are not supported. See this Firefox bug. |
| Xalan | nodeset-exslt.xsl | Full | |

Configuration

config.xsl

The config.xsl file contains some parameters that can be configured for your situation.

  • config-debug: determines the debug messages written during execution. Multiple values are allowed:
    • INFO: information messages (through template 'inform');
    • STACK: stack trace (through template 'log');
    • ERROR: error messages (through template 'throw').
  • config-root: for XSD schemas that contain multiple root nodes, determines the number of the root node to be used. Defaults to 1.
  • config-callback: contains the JavaScript function that is called when a user submits the form. It should accept a string argument containing the generated XML.
  • config-title: contains the title given to the generated document.
  • config-script: optionally contains the URL to a JavaScript reference, which will be referenced in the generated document.
  • config-style: optionally contains the URL to a CSS reference, which will be referenced in the generated document.
  • config-documentation: specifies whether an element's annotation/documentation tags should be used for descriptions (works together with config-language). Defaults to false, in which case the element's @name or @ref (unprefixed) attribute is used as the description.
  • config-language: optionally specifies which annotation/documentation language (determined by xml:lang) should be used for descriptions. Defaults to none.
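
As an illustration of how config-documentation and config-language relate to a schema, the sketch below annotates one element with labels in two languages. The element name and label texts are hypothetical. Going by the parameter descriptions above, enabling config-documentation and setting config-language to nl would be expected to select the Dutch label; with config-documentation left at false, the name attribute (firstname) would be used instead.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:element name="firstname" type="xs:string">
		<xs:annotation>
			<!-- One documentation tag per language, distinguished by xml:lang -->
			<xs:documentation xml:lang="en">First name</xs:documentation>
			<xs:documentation xml:lang="nl">Voornaam</xs:documentation>
		</xs:annotation>
	</xs:element>
</xs:schema>
```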

XSD Schemas

Most elements defined in XSDs will render fine without any configuration: [type=xs:int] will become input[type=number] elements, [type=xs:boolean] will become input[type=checkbox] elements etc. Note that some elements support configuration or have peculiar behavior, however:

  • xs:string: by default, this is rendered as an input[type=text] element. To allow multiple lines and render a textarea instead, you have to explicitly allow line breaks in the pattern by including '\n'. Note that the pattern can be anything, as long as it contains a '\n'. The simplest way to do this is by adding '(\n)?' after a pattern. A multiline pattern with no further restrictions could look like this: '.*(\n)?'.
  • xs:duration: durations are rendered as input[type=range] elements, which look like sliders in most browsers. Durations have to follow the format defined in W3C's specification, and this format can be (partially) included in a pattern restriction. XSD2HTML2XML uses this pattern to determine the smallest unit that needs to be supported. For example, to use a duration that supports hours and minutes, add this pattern: 'PT\d{2}H\d{2}M'. The rendered range is then scaled in minutes (following the last M). To further restrict this duration to a maximum of 1 day, specify maxInclusive in W3C's notation in the smallest unit (i.e. minutes): 'PT1440M' (= 60 minutes * 24). Note that the pattern of an xs:duration type must be specified explicitly in order to generate a valid value.
  • xs:hexBinary & xs:base64Binary: these types are rendered as input[type=file] elements. For security reasons, browsers do not allow these elements to have default values. That means that, if an input[type=file] element has a default, fixed, or populated value, this is not shown to the user. If such an element is required, it could never be submitted with the default value. To solve this, the required attribute of input[type=file] elements is added only after the user has changed the populated value.
  • xs:enumeration: any type with this restriction will become a select element. It's possible to define additional restrictions on input, but usually this doesn't make much sense because the input is restricted to predetermined items.
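
The type-specific rules above can be put together in a small schema such as the following sketch. The element names (feedback, comment, travelTime, size) and the enumeration values are hypothetical; the patterns and the maxInclusive value are the ones mentioned in the list.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:element name="feedback">
		<xs:complexType>
			<xs:sequence>
				<!-- The pattern contains '\n', so a textarea is expected -->
				<xs:element name="comment">
					<xs:simpleType>
						<xs:restriction base="xs:string">
							<xs:pattern value=".*(\n)?"/>
						</xs:restriction>
					</xs:simpleType>
				</xs:element>
				<!-- Hours and minutes, scaled in minutes, at most one day -->
				<xs:element name="travelTime">
					<xs:simpleType>
						<xs:restriction base="xs:duration">
							<xs:pattern value="PT\d{2}H\d{2}M"/>
							<xs:maxInclusive value="PT1440M"/>
						</xs:restriction>
					</xs:simpleType>
				</xs:element>
				<!-- An enumeration restriction is expected to become a select element -->
				<xs:element name="size">
					<xs:simpleType>
						<xs:restriction base="xs:string">
							<xs:enumeration value="small"/>
							<xs:enumeration value="medium"/>
							<xs:enumeration value="large"/>
						</xs:restriction>
					</xs:simpleType>
				</xs:element>
			</xs:sequence>
		</xs:complexType>
	</xs:element>
</xs:schema>
```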

Examples

These examples demonstrate an HTML5 form generated from an XML schema. The resulting XML is then used to populate the form again as a last step.

The first example (complex-sample) demonstrates all supported data types. The second example (namespaces-sample) illustrates an XML schema importing two documents with another namespace, and including one with the same namespace.

| XML Schema (XSD) | Generated HTML Form | Generated XML | Filled-in HTML Form |
| complex-sample.xsd | form.html | complex-sample.xml | form-filled.html |
| namespaces-sample.xsd (import-doc1.xsd, import-doc2.xsd, double-import-doc.xsd, include-doc.xsd) | form.html | namespaces-sample.xml | form-filled.html |

Customization

If you want to add custom functionality to the generated form, I highly recommend doing so using JavaScript and CSS, and not altering the XSLT directly. This project frequently has new releases, and updating is a hassle if you have custom functions built in. Use the config-script and/or config-style parameters to generate HTML elements referring to external JavaScript or CSS files. See the section on config.xsl for details.

To access data that is not automatically placed in the form, use appinfo elements in your XSD. Any data stored in such an appinfo element is converted into data-appinfo-... HTML attributes. For example:

<xs:element type="xs:string" name="string" default="singleline string">
	<xs:annotation>
		<xs:appinfo source="https://github.com/MichielCM/xsd2html2xml">
			<class>element-with-extra-data</class>
		</xs:appinfo>
		<xs:appinfo>
			<identifier>abc123</identifier>
		</xs:appinfo>
	</xs:annotation>
</xs:element>

Please note that appinfo elements with their source referring to https://github.com/MichielCM/xsd2html2xml are added to the HTML element directly, without a data-appinfo- prefix. The above code leads to the following generated HTML:

<label ... class="element-with-extra-data" data-appinfo-identifier="abc123">
	<input ... >
</label>

These attributes can be accessed through JavaScript or CSS and displayed or transformed at will:

label.element-with-extra-data:after {
	content: attr(data-appinfo-identifier);
}

Under the Hood

The third version of XSD2HTML2XML has a modular structure, to make development, maintenance, and implementation easier. The main file, xsd2html2xml.xsl, includes all other files in these directories:

  • matchers: these files form the starting point of parsing an XSD schema. Each element gets matched by one of the templates in these files, configured, and forwarded to one of the handlers for further rendering;
  • handlers: these files take care of the actual rendering of HTML. All elements are treated either as complex elements or as simple elements. The remaining files deal with the specific input, textarea, or select elements configuration.
  • utils: these files contain generic templates to handle namespace documents, string manipulation, type determination, etc.;
  • css: these templates contain CSS stylesheets for styling the generated form;
  • js: these templates contain JavaScript that handles the interactivity of the generated form:
    • event-handlers: these functions respond to button clicks relating to adding or removing form elements etc.;
    • html-populators: these functions populate the form with XML content. An XML document should be specified by storing it in the data-xsd2html2xml-source attribute of the meta[name='generator'] element;
    • initial calls: when the HTML finishes loading, these calls are executed;
    • polyfills: contains polyfills for early browser support;
    • value-fixers: contains functions to set values to HTML elements that differ from their XML counterparts. The XML-specific date format differs from the input[type=date] HTML elements, for example;
    • xml-generators: these functions generate XML from a submitted form.

forward & forwardee

In XSLT 1.0 processors, dynamically created XML structures (e.g. for documents loaded on the fly) need to be converted to nodesets before they can be used. Different processors support different functions for this purpose. Rather than prescribe one implementation, I use specific nodeset-xxx.xsl files to handle the nodeset conversion. This means that any template calling get-namespace-documents, for example, has to run its result through the forward template to be able to work with it as a nodeset. That is why some templates have xxx-forwardee counterparts, which are called through the forward template. See this thread for details.
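
For readers unfamiliar with the underlying XSLT 1.0 restriction, the sketch below shows the general nodeset conversion using the EXSLT exsl:node-set function. It is a generic, standalone illustration and not the project's forward template or the contents of its nodeset-xxx.xsl files; the variable and element names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                extension-element-prefixes="exsl">

	<xsl:output method="text"/>

	<xsl:template match="/">
		<!-- A variable with element content is a result tree fragment in XSLT 1.0 -->
		<xsl:variable name="docs">
			<doc>first</doc>
			<doc>second</doc>
		</xsl:variable>

		<!-- Selecting $docs/doc directly is not allowed in XSLT 1.0;
		     the fragment has to be converted to a nodeset first. -->
		<xsl:for-each select="exsl:node-set($docs)/doc">
			<xsl:value-of select="."/>
			<xsl:text>&#10;</xsl:text>
		</xsl:for-each>
	</xsl:template>
</xsl:stylesheet>
```

MSXML offers the equivalent msxsl:node-set function in the urn:schemas-microsoft-com:xslt namespace, and XSLT 2.0 and later processors need no conversion at all, which is why a separate file per processor family is a reasonable design.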

FAQ

  • Why a version 3.x?
    Version 3 does not bring a lot of new features over version 2, but it's a lot more efficient and future-proof. XSLT is not well-suited to creating projects of this scale, and having different files at least provides some sort of separation of concerns. The rudimentary stack trace really helps debugging and maintaining. Apart from that, there is now just one version to use in any scenario.
  • How is version 3.x different?
    For details, check the release notes for each release. In general (as opposed to version 2.x): included and imported documents are kept in memory whenever possible, and not loaded every time they are referenced; form population is done through JavaScript, removing the need for a [dyn:]evaluate XSLT function; support for all common XSLT 1-3 implementations (including browsers); support for XHTML and generating XML through XSLT has been removed.
  • Are there any known bugs?
    Please see the issue list.
  • Will this work with any XML schema?
    Yes, as long as you don't use the more esoteric elements of XSD, such as field or keygen. See the full list of supported tags above.
  • Do I have to annotate my XML schema?
    No, but you can do so to override the default labels. By default, the name attribute of elements is used for labels. If you want a custom label, add an xs:annotation/xs:documentation containing your custom label to the element.
  • Which browsers are supported?
    HTML5 support is steadily increasing with every browser release, so the more modern the browser, the better. However, generated forms have been confirmed to work in IE9, IE10, IE11, Edge, Chrome, Firefox, and Safari.
  • But gDay and gMonth don't work in Edge!
    They don't out of the box, because the values these types require (e.g. --03 for March) are not valid numbers and Edge refuses to set them as values. A workaround is to use an enumeration for these types, as shown in complex-sample (gMonthEnum).
  • I can't edit xs:long values in Chrome!
    The upper and lower bounds of long values are too large for Chrome to work with. Either use another browser or comment out the bounds in the set-type-specifics function for the xs:long type.
  • What's the easiest way to test this?
    Please see my website for a free online implementation or an offline Java application.