Proposal : Generic Type System

Question

Opened this issue 19 days ago · 0 comments

Problem

The current XML-based system for describing node definitions and node graphs requires each specialization of a node to be spelled out explicitly, e.g. ND_add_float, ND_add_color3, ND_add_vector2, etc. This leads to a verbose data library and encourages copy/paste authoring, which is prone to errors.

The size of both the source and the installed artifacts is also a consideration for the MaterialX project, and if we can reduce the footprint of the project, that would benefit several downstream MaterialX integrations.

Finally, because of the exhaustive nature of the current system, adding a new type becomes incrementally more difficult each time, because new node definitions must be added for each of the general purpose nodes: ND_add_myType, ND_multiply_myType, ND_image_myType, etc. This problem becomes combinatorially worse if there are multiple templated types involved.

Solutions

The general theme of the possible solutions below is to develop a system that describes the different specializations of a given node with a single description, or a small number of descriptions, each referencing the set of types that are valid for it.

Build time template system

The simplest solution proposed here would be a build time template based system. Here we propose augmenting the Data Library source files with additional information describing families of types, that could then be used to concretely instantiate the node definitions currently required by the system during the building of the MaterialX project.

A build time template system could be built upon:

An existing template-based system, such as Jinja (https://jinja.palletsprojects.com/en/stable/).
A bespoke metadata based system using custom syntax embedded in the Data Library files, or perhaps side-car files or pure python.
An extension of the existing XML syntax, that is processed and then removed during the build stage.

Extending the XML syntax would be the most natural starting point for progressing to the next solution, though it would perhaps be the most cumbersome to implement as an initial step. XML is not the easiest syntax in which to describe loops or other control flow, and other systems, perhaps more Python based, might lead to a simpler, cleaner source syntax for the Data Library.

A build time solution has the significant benefit of being incremental. It would not require any existing downstream MaterialX integrations to adopt any changes.
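As a rough illustration of the build-time idea, the sketch below expands one templated description into the concrete node definitions the library spells out by hand today. It uses only Python's standard `string.Template` rather than Jinja; the `$type` placeholder syntax, the `NUMERIC_TYPES` list and the `expand` helper are all assumptions made for this example, and the `value` attributes are left out on purpose.

```python
from string import Template

# Hypothetical source template for one family of nodedefs. The $type
# placeholder syntax is an assumption, not an agreed MaterialX format.
NODEDEF_TEMPLATE = Template(
    '<nodedef name="ND_add_$type" node="add" nodegroup="math">\n'
    '  <input name="in1" type="$type" />\n'
    '  <input name="in2" type="$type" />\n'
    '  <output name="out" type="$type" defaultinput="in1" />\n'
    '</nodedef>'
)

# Hypothetical type family the template is instantiated for.
NUMERIC_TYPES = ["float", "color3", "vector2"]


def expand(template, types):
    """Return one concrete nodedef string per type in the family."""
    return [template.substitute(type=t) for t in types]


nodedefs = expand(NODEDEF_TEMPLATE, NUMERIC_TYPES)
print("\n\n".join(nodedefs))
```

Because the output is ordinary concrete XML, nothing downstream would need to know the template ever existed.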

Runtime polymorphic system

Another possible solution would be a runtime polymorphic system. We could extend the XML syntax to describe the necessary relationships between the types and/or explicitly define groups of types that could be used in place of concrete types in node definitions.

Example

<template name="numeric" options="float, vector2, vector3, vector4, color3, color4" />

<nodedef name="ND_range_numeric" node="range" nodegroup="adjustment">
  <input name="in" type="numeric" value="zero" />
  <input name="inlow" type="numeric" value="zero" />
  <input name="inhigh" type="numeric" value="one" />
  <input name="gamma" type="numeric" value="one" />
  <input name="outlow" type="numeric" value="zero" />
  <input name="outhigh" type="numeric" value="one" />
  <input name="doclamp" type="boolean" value="false" />
  <output name="out" type="numeric" defaultinput="in" />
</nodedef>

<nodedef name="ND_multiply_float" node="multiply" nodegroup="math">
  <input name="in1" type="numeric" value="0" />
  <input name="in2" type="float" value="1.0" />
  <output name="out" type="numeric" defaultinput="in1" />
</nodedef>

<nodedef name="ND_multiply_numeric" node="multiply" nodegroup="math">
  <input name="in1" type="numeric" value="0" />
  <input name="in2" type="numeric" value="1" />
  <output name="out" type="numeric" defaultinput="in1" />
</nodedef>

(Credit: conversation with Jonathan Stone.)

Here we see one possible example of defining a <template> element that names its candidate types, followed by node definitions that use that named template type. At runtime, any of the candidate types would be a valid concrete type to pass in place of the template name, i.e. you could connect a float output into the "in" port of type numeric above.
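To make the runtime behaviour a little more concrete, here is a minimal sketch of how a consumer might decide whether a connection is valid against a templated port. It parses the proposed `<template>` element with Python's standard `xml.etree.ElementTree`; the `load_templates` and `accepts` helpers and the `<materialx>` wrapper used for the test document are assumptions for illustration, not existing MaterialX API.

```python
import xml.etree.ElementTree as ET

# A small document using the proposed (not yet existing) <template> syntax.
LIBRARY = """
<materialx>
  <template name="numeric" options="float, vector2, vector3, vector4, color3, color4" />
  <nodedef name="ND_multiply_numeric" node="multiply" nodegroup="math">
    <input name="in1" type="numeric" value="0" />
    <input name="in2" type="numeric" value="1" />
    <output name="out" type="numeric" defaultinput="in1" />
  </nodedef>
</materialx>
"""


def load_templates(root):
    """Map each template name to the set of concrete types it accepts."""
    return {
        t.get("name"): {opt.strip() for opt in t.get("options").split(",")}
        for t in root.iter("template")
    }


def accepts(port_type, concrete_type, templates):
    """True if a value of concrete_type may be connected to a port of port_type."""
    if port_type in templates:
        return concrete_type in templates[port_type]
    # Not a template name, so fall back to an exact type match.
    return port_type == concrete_type


root = ET.fromstring(LIBRARY)
templates = load_templates(root)
in1 = root.find("nodedef/input[@name='in1']")
print(accepts(in1.get("type"), "float", templates))   # True
print(accepts(in1.get("type"), "string", templates))  # False
```

This sketch checks a single port in isolation. Whether every numeric port on one node must resolve to the same concrete type is a design question the example does not try to answer.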

Build time AND runtime template system - XML based

The final possible solution proposed here, and the most complex, is a combination of the two above. It would be possible to create an XML-based system that could be baked out to concrete definitions at build time. That same syntax could, depending on some build time configuration, instead be installed unmodified and used by a MaterialX library that is capable of using a runtime polymorphic system as described above.
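The "bake" half of this combined approach could look roughly like the sketch below, which reads the proposed `<template>` syntax and writes out one concrete nodedef per candidate type. Several things here are assumptions for the sake of the example: the `bake` helper itself, the naming rule that swaps the template suffix for the type name, and the rule that every templated port in a nodedef resolves to the same concrete type. Values are omitted to stay clear of the Named Values question.

```python
import copy
import xml.etree.ElementTree as ET

# Source document in the proposed template syntax (hypothetical).
SOURCE = """
<materialx>
  <template name="numeric" options="float, color3" />
  <nodedef name="ND_multiply_numeric" node="multiply" nodegroup="math">
    <input name="in1" type="numeric" />
    <input name="in2" type="numeric" />
    <output name="out" type="numeric" defaultinput="in1" />
  </nodedef>
</materialx>
"""


def bake(root):
    """Replace templated nodedefs with one concrete copy per option, in place."""
    templates = {}
    for t in list(root.findall("template")):
        templates[t.get("name")] = [o.strip() for o in t.get("options").split(",")]
        root.remove(t)  # the template element does not survive the bake

    for nodedef in list(root.findall("nodedef")):
        used = {port.get("type") for port in nodedef} & set(templates)
        if not used:
            continue  # already concrete, leave untouched
        if len(used) > 1:
            raise ValueError("this sketch handles one template name per nodedef")
        name = used.pop()
        index = list(root).index(nodedef)
        root.remove(nodedef)
        for offset, concrete in enumerate(templates[name]):
            baked = copy.deepcopy(nodedef)
            # Assumed naming rule: swap the template suffix for the type name.
            baked.set("name", nodedef.get("name").replace("_" + name, "_" + concrete))
            for port in baked:
                if port.get("type") == name:
                    port.set("type", concrete)
            root.insert(index + offset, baked)
    return root


baked_root = bake(ET.fromstring(SOURCE))
print([n.get("name") for n in baked_root.findall("nodedef")])
print(ET.tostring(baked_root, encoding="unicode"))
```

Skipping the `bake` step and installing `SOURCE` unmodified would correspond to the runtime polymorphic configuration.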

The authors of this proposal think this solution is likely the end goal of this work, but they also think that trying to jump right to this solution is probably too large a singular piece of work to take on in one go. Instead, this solution is discussed here to help the community and the developers on the project keep in mind this potential end goal.

Additional Considerations

Changing the way that node definitions, nodegraphs and potentially even types are described could be a pretty fundamental change, and so we need to take care to consider potential consequences.

Named Values

Absent from this proposal is any discussion of how to handle the differing value strings for the varying types, e.g. 0.0 for float vs 0.0, 0.0, 0.0 for color3. This is discussed in a separate, related Named Values proposal. That proposal is kept separate because it could be designed and implemented as an orthogonal system with its own merits. If both proposals are undertaken, there will likely be overlap between them.

Increased cognitive complexity

The current system, while verbose, is very straightforward to understand, parse and process. Even if the new solution is still purely XML based, it's inevitable that reading the source will be more complex. The installed data library could also retain polymorphic information, which would make it more complex to parse and process.

Runtime changes required

MaterialX is an important component in several pieces of commercial software. Any change to the file format used to describe the Data Library, or MaterialX documents themselves, needs to be carefully considered in terms of its impact on those downstream consumers, and weighed against the increased burden of work for adoption. OpenUSD is a particularly important example here.

This consideration is one reason to strongly consider a build time solution over a runtime solution, as we would end up installing the same set of Data Library files as we do today, just with a hopefully simpler and less redundant set of sources.

Additional dependencies

Jinja or any other external system used at build time (or runtime) to implement this generic type system will increase the dependency complexity of the MaterialX project. The project currently takes a "batteries included" approach, where all that is required is to clone the repository, and everything necessary to build is provided (though obviously things like Python and Emscripten are needed for those bindings).

This consideration could be mitigated if we used this system to generate the artifacts in the repository itself. usdGenSchema operates in a similar way, generating files that are then committed back to the repository, thus removing the strict build-time dependency if a user just wants to build the existing defined schema.

Conclusion

It is the opinion of the authors of this proposal that a generic type system for the MaterialX Data Library would be a significant improvement to the MaterialX ecosystem, and that any of the above solutions would be a step in a positive direction. The authors suggest taking an incremental approach and validating at each step, thus lowering the risk and complexity. Work could start with an XML-based build time system, and those source files could then be used to prototype a runtime system.