Early discussion around asset variants
zellski opened this issue · 40 comments
Introduction
One of the use cases for 3D assets we often see referenced is the concept of compactly represented, easily selectable variants. This has obvious applications across the board, but has seen increased attention in the burgeoning realm of 3D-assisted commerce, whether on flat surfaces or in AR/VR/XR/MR. The AR use case is particularly interesting, because it comes with strict fidelity requirements in real-world measurements.
For a vivid example, consider the product page for a pair of sneakers, available in six colour combinations and a range of eight sizes, which should additionally be displayed atop a small chunk of ground plane that could be asphalt, track or trail. For extra credit, the sizes are not simple homogeneous rescalings of the canonical asset, but also include subtle changes in geometry that need to be correct.
We could generate 6 * 8 * 3 = 144 assets, one per combination. We could be a little more clever and load the ground plane asset separately, but we'd still end up with 48 assets, which seems prohibitive, and silly given the enormous redundancy across the assets.
The Challenge
So we'd like to find a way to package all these variants up in a single asset in some way that maximises reuse of shared data. The efficiency with which we can do this is critically important: if it bloats the asset too much, then the initial time-to-interaction takes a blow, and the scheme becomes counter-productive.
A compliant loader would need the ability to (very) quickly switch between asset variants on demand, at runtime.
To have real-world meaning, the representation scheme (which would surely take the shape of a glTF extension, or perhaps a few interacting ones) must also be crafted with tooling in mind. It must not be prohibitively difficult or error-prone to generate and process these packaged-up assets, whether for content creation tools, optimisation pipelines, or the actual humans tasked with creating the assets.
A final challenge is that interest in this realm is growing quickly. We can imagine that inside many tech companies efforts are already brewing to come up with ad-hoc solutions, to the ultimate detriment of retailers, who can't realistically be asked to produce a different asset for each platform. Balancing the need for platforms to move quickly against the urge for future consensus will take work and discipline.
Some obvious possibilities
While this post is intended to throw open discussion, some solutions are so obvious that to omit them seems disingenuous:
- Using simple tag strings to identify variants.
- This lends itself well to composition, where variant-enabled sneakers rest atop a variant-enabled ground plane, and two tags are used to identify the combination.
- The ability to simply switch a mesh primitive from pointing to one material in the asset to another.
- An interesting question is whether one might also wish to be able to switch between different sets of per-vertex colour.
- Even more dynamic solutions immediately come to mind, e.g. allow the recursive, tagged overwriting of any subset of mesh primitive properties. This yields maximal expressive power, but would surely make optimised client implementations increasingly complex.
- For geometric variations, we could easily imagine tapping glTF's existing animation support -- skinning/skeletal and morph targets -- with simple static poses. This has the great advantage that it's already implemented in compliant loaders/viewers/pipelines, and we'd only need to couple variant tags with animations (as @donmccurdy mentions, this is not unrelated to the EXT_property_animation extension proposal).
- Finally, tapping the power of the scene graph seems obvious: for simplicity, perhaps allow per-variant switching of the glTF scene; for a more general solution, again at some possible cost in client complexity, allow per-node switching. This variant-enables all the power of node transforms, mesh selection, point light setups, and so forth.
As @lexaknyazev has pointed out, how we best navigate the solution space likely depends on what's reasonably achievable in tooling (more so than what seems at first blush to yield the optimal combination of expressive power and client complexity.) To that end, we should work with tooling companies to get a sense of what is achievable; we should also consider the possibility of a reference implementation of a tool that would combine distinct asset exports (with as much commonality as possible) into a single, new, variant-enabled asset.
The immediate future
Facebook (my employer) may have rapidly emerging needs here in the next half-year or so. To that end, we'll likely write up a vendor extension -- FB_asset_variants or something -- to unblock ourselves, and we will post the specification here, with the understanding that the goal down the road should be something of broad applicability in the EXT_ namespace, and ultimately something that can be fully ratified.
Meanwhile –– what does everyone else yearn for? What should we be cautious about? What unnerves engine implementors and tool makers?
For a completely hypothetical tl;dr vivid example, we could easily imagine material-switching looking something like
"meshes": [
{
"primitives": [
{
"attributes": {
"POSITION": 0,
"NORMAL": 1,
"TEXCOORD_0": 2
},
"mode": 4,
"indices": 3,
"material": 0,
"extensions": {
"FB_asset_variants": [
{
"tag": "nike_revolution_navy",
"material": 0
},
{
"tag": "nike_revolution_black",
"material": 1
}
]
}
}
]
}
],
or equally simply:
"scenes": [
{
"nodes": [
0
]
}
],
"scene": 0,
"extensions": {
"FB_asset_variants": [
{
"tag": "nike_revolution_laced",
"scene": 0
},
{
"tag": "nike_revolution_unlaced",
"scene": 1
}
]
}
It'd be tempting to scatter such switches all over the place –– but in the interest of keeping optimised viewers easy to write, we would probably want to limit it to a few carefully selected places.
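As a rough illustration of how a loader might consume the material-switching example above, here is a minimal Python sketch. It assumes the hypothetical FB_asset_variants layout shown in this post (a list of tag/material pairs on a mesh primitive); the function name resolve_material is made up for illustration and is not part of any loader.

```python
def resolve_material(primitive, active_tags):
    """Return the material index to use for a mesh primitive.

    Walks the hypothetical FB_asset_variants list in order and returns the
    material of the first entry whose tag is active. Falls back to the
    primitive's regular "material" when no variant applies.
    """
    variants = primitive.get("extensions", {}).get("FB_asset_variants", [])
    for variant in variants:
        if variant["tag"] in active_tags:
            return variant["material"]
    return primitive.get("material")


primitive = {
    "material": 0,
    "extensions": {
        "FB_asset_variants": [
            {"tag": "nike_revolution_navy", "material": 0},
            {"tag": "nike_revolution_black", "material": 1},
        ]
    },
}

print(resolve_material(primitive, {"nike_revolution_black"}))  # 1
print(resolve_material(primitive, set()))                      # 0
```

Because only an index changes, a viewer that already holds both materials in memory would not need to reload anything when the tag changes.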
Thanks for opening up this issue @zellski! Shopify is also really invested in figuring this out as so many of our merchants rely on having different variants for their products.
A few weeks ago I took a stab at what an extension could look like, and it was based off of how USD supports variants (https://graphics.pixar.com/usd/docs/Authoring-Variants.html).
You can try out the live demo here: https://pushmatrix.github.io/gltf-variants-extension/
Code is here: https://github.com/pushmatrix/gltf-variants-extension
I took an approach where the variants are stored on the materials themselves.
"materials": [
{
"pbrMetallicRoughness": {
"baseColorFactor": [
0.800000011920929,
0.0,
0.0,
1.0
],
"metallicFactor": 0.0
},
"name": "Material",
"extensions": {
"SHOP_variants": {
"Color": {
"Red": {
"baseColorFactor": [1,0,0,0]
},
"Green": {
"baseColorFactor": [0,1,0,0]
},
"Blue": {
"baseColorFactor": [0,0,1,0]
}
},
"Finish": {
"Glossy": {
"roughness": 0
},
"Matte": {
"roughness": 1
}
}
}
}
}
],
For every given material, you would list all the variant options that can affect that material. So for example, if you have the "Red" value selected for the "Color" option, it would override the base color with [1,0,0,0]. If you have the "Glossy" value selected for "Finish", it would override the roughness with value 0. You could do the same with textures.
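A minimal Python sketch of the override behaviour described above, assuming the SHOP_variants layout from the example (option name, then value name, then property overrides). The helper names list_options and apply_selection are hypothetical, and the property keys are copied from the example as written.

```python
import copy


def list_options(material):
    """Return {option name: [value names]} so a viewer can build its UI."""
    variants = material.get("extensions", {}).get("SHOP_variants", {})
    return {option: list(values) for option, values in variants.items()}


def apply_selection(material, selection):
    """Return a copy of pbrMetallicRoughness with the selected overrides.

    Overrides are applied in the order the options are defined.
    """
    variants = material.get("extensions", {}).get("SHOP_variants", {})
    resolved = copy.deepcopy(material.get("pbrMetallicRoughness", {}))
    for option, values in variants.items():
        chosen = selection.get(option)
        if chosen in values:
            resolved.update(values[chosen])
    return resolved


material = {
    "pbrMetallicRoughness": {
        "baseColorFactor": [0.8, 0.0, 0.0, 1.0],
        "metallicFactor": 0.0,
    },
    "extensions": {
        "SHOP_variants": {
            "Color": {
                "Red": {"baseColorFactor": [1, 0, 0, 0]},
                "Green": {"baseColorFactor": [0, 1, 0, 0]},
                "Blue": {"baseColorFactor": [0, 0, 1, 0]},
            },
            "Finish": {"Glossy": {"roughness": 0}, "Matte": {"roughness": 1}},
        }
    },
}

print(list_options(material))
print(apply_selection(material, {"Color": "Green", "Finish": "Matte"}))
```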
For configurators, I think it's really important that we capture the names of the possible options. It's easy for a viewer to load the above extension and see that there are 3 possible colors, and two possible finishes. That allows for mixing and matching between the options vs if we specified variants called nike_green_glossy and nike_green_matte.
Here's an example of a glTF combining two property overrides into one. The option is called Style, and the possible values are "Regular" and "Deluxe".
"materials": [
{
"pbrMetallicRoughness": {
"baseColorFactor": [
0.800000011920929,
0.0,
0.0,
1.0
],
"metallicFactor": 0.0
},
"name": "Material",
"extensions": {
"SHOP_variants": {
"Style": {
"Regular": {
"baseColorFactor": [1,0,0,0],
"roughness": 1
},
"Deluxe": {
"baseColorFactor": [1,1,0,0],
"roughness": 0
}
}
}
}
}
],
Regular is red and matte, deluxe is yellow and glossy.
Note the glossiness can be seen by the white highlight that appears on the cube.
This does give a ton of flexibility, but the tooling will need to be done in a way to make authoring this simple.
The issue is that real product variants result in physical (shape) as well as appearance alternatives, hence a model like USD is much more flexible. You should also note that not all variants work with all others, so configurators have complex rule sets, e.g. if you have A, you can have color B or C, but if you choose C, you can't have option D. The result is that a single glTF file could represent a very simple product, but a scalable solution would require a means to compose the product structure (node graph) dynamically. USD is looking to be a good candidate for this.
Thanks for starting this discussion, @zellski. Adobe is definitely interested in this and I would at least like to discuss defining a multi-vendor, EXT extension sooner rather than later. I've started sketching out designs for an extension a couple of times already but always get stuck at the conditional logic. Configurators get very complex, very quickly. I always think about car packages and how certain features are only available when certain other packages have also been selected. There are prerequisites and also antirequisites for each choice. What's the best way to cleanly specify that in this extension?
Of course, we could always separate the available options from the conditional requirements and leave that part to fall on the application-level logic? So, if I took my car model and loaded it in another glTF viewer, maybe I'd be able to configure it in a way that doesn't represent a car that you can actually buy. Is that a problem?
I'm with @MiiBond's suggestion of letting that be application-level logic.
I'm not sure if it's the glTF's responsibility to contain that information since it could be a rabbit hole of complexity. For example, what happens if we want to represent price? Would the pricing of a product be baked into the glTF? Different configurations have different prices, so that would need to be included in the data model. Then what happens when you have multi-currency? The file would get super bloated.
I think it's best to leave the application to deal with the possible combinations.
Shopify is also onboard for discussing a multi-vendor EXT.
IKEA is also on board on the multi-vendor EXT!
Thanks folks for comments.
I think it's wise to be sticklers for separation of concerns here. The glTF extension should provide expressive power to the application in as dynamic a way and with as simple an API as is feasible. It should not try to make inroads on application logic itself, or we'll end up with one of those horrors where increasingly cryptic logic, written in painstaking JSON, becomes unreadable long before it becomes powerful enough to really do its job.
This is why I focused on simple string-based tags, and on allowing multiple tags to be active at once. Then one set of tags can control e.g. geometric deformation, another set of tags can –– quite separately –– control the material properties of the shoe laces, while another set controls the overrides of the ground plane. Furthermore, we can then stick any variant-enabled sub-models together, and build our configurator UIs through composition as well, and things will just work.
I think accepting human-readable labels in the glTF spec is fine –– it would be nice if you could drop a model into a generic web-based viewer, and the viewer would present you with a debug-quality interface for selecting variants –– but in practice, for non-trivial use cases, I think such labels would really come from the application (for i18n reasons if nothing else). Labels shouldn't replace tags as keys; tags should be succinct and logically composited, whereas human-readable strings should be allowed to vary according to whim.
Whether we put the extension on materials, and override material properties, or put the extension on the mesh and override the material reference –– I'm pretty agnostic. The latter feels more powerful, at the cost of some redundancy, but the former may be more intuitive. Runtime performance needs to be considered pretty carefully here –– we don't want to have to reload a model when a selection changes; we don't want to do anything that might flicker or freeze. We shouldn't add requirements to a simple extension that will be onerous for some early engines to optimise for.
I work with Zell at FB, so we are on the same page about our need for a solution. Great to see all the involvement here from other companies! I think having the variants within the materials themselves adds complexity in a couple areas:
- When a material is used multiple places, it would require duplication of the material if variants are different in the multiple places. For example, in this shoe, the blue suede is present multiple places, but if only the lacing area has multiple variants, then the material would need to be defined twice, one with variants and one without, and then it would get trickier to identify that this is actually the same material that is already loaded into the shader. I think it would be easier to specify this with the variants attached to the specific mesh segments.
- Another complication may arise with partial/multiple overrides. Consider this case modified from above:
"SHOP_variants": {
"Color": {
"Red": {
"baseColorFactor": [1,0,0,0],
"roughness": 0
},
"Green": {
"baseColorFactor": [0,1,0,0]
},
"Blue": {
"baseColorFactor": [0,0,1,0]
}
},
"Finish": {
"Glossy": {
"roughness": 0
},
"Matte": {
"roughness": 1
}
}
}
Is this allowed? What is the behavior if the glTF file then tries to apply "Red" & "Matte"? Arbitrary?
This complication also raises the question: what would be the easiest way for modeling software to expose variant creation? I think it would be much easier to allow a user to create multiple named materials and assign them as "variants" to the mesh. I think it would be harder to control modifying single parameters and then somehow ensuring that they combine exclusively.
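The ambiguity described above is at least mechanically detectable. A minimal Python sketch, assuming the SHOP_variants layout from the earlier example, that a validator or exporter could use to flag options whose overrides touch the same property (find_conflicts is a hypothetical helper name):

```python
from itertools import combinations


def find_conflicts(variants):
    """Return (property, option_a, option_b) triples for overlapping overrides.

    Two options conflict when any of their values override the same property,
    because the result then depends on the order of application.
    """
    touched = {}
    for option, values in variants.items():
        props = set()
        for overrides in values.values():
            props.update(overrides)
        touched[option] = props

    conflicts = []
    for (a, props_a), (b, props_b) in combinations(touched.items(), 2):
        for prop in sorted(props_a & props_b):
            conflicts.append((prop, a, b))
    return conflicts


variants = {
    "Color": {
        "Red": {"baseColorFactor": [1, 0, 0, 0], "roughness": 0},
        "Green": {"baseColorFactor": [0, 1, 0, 0]},
        "Blue": {"baseColorFactor": [0, 0, 1, 0]},
    },
    "Finish": {"Glossy": {"roughness": 0}, "Matte": {"roughness": 1}},
}

print(find_conflicts(variants))  # [('roughness', 'Color', 'Finish')]
```

Whether such a conflict should be an error or resolved by definition order is exactly the open design question; the sketch only surfaces it.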
Hi @debuggrl, great to have you part of the convo!
- I agree that this would cause lots of duplication if you want to share materials across meshes. I'd be curious to see if more often than not materials will have textures assigned via UV maps, so sharing them wouldn't be as common.
- For overrides, I think it would happen in the order that they are defined in. That's what I believe USD does with their variant sets (https://graphics.pixar.com/usd/docs/USD-Glossary.html#USDGlossary-VariantSet). I still need to look further into how they handle their overrides. Seems like with USD you can override pretty much anything, and you define those overrides within those variant sets.
What would help this discussion is picking a single baseline / realistic scenario to compare the approaches against, like the shoe you mentioned.
I've been thinking a lot about this one by Sketchfab: https://demos.sketchfab.com/clients/nike-configurator/index.html.
6 different parts of the shoe that can be customized, with on average 5 colour choices per part.
Each part is more than just a diffuse colour change, it would have textures to it too. Ideally the example we choose would also have changing shape too, and not just textures.
In Zell's initial suggestion, would you have an extension for each part of the mesh?
"FB_asset_variants": [
{
"tag": "nike_base_white",
"material": 0
},
{
"tag": "nike_base_blue",
"material": 1
},
{
"tag": "nike_base_red",
"material": 2
}
]
}
.......
"FB_asset_variants": [
{
"tag": "nike_midsole_white",
"material": 3
},
{
"tag": "nike_midsole_blue",
"material": 4
},
{
"tag": "nike_midsole_red",
"material": 5
}
]
}
And so in a gltf viewer, when you click the midsole, it would look up the variants associated to that mesh, and give you the 3 options available to it (nike_midsole_white, nike_midsole_blue, nike_midsole_red)?
If the mesh is also swappable (like switching out the type of sole), I'm guessing those would need to be duplicated?
Material Switching
I've come to feel pretty strongly that all possible materials should be generated explicitly in the glTF. Allowing the override of specific properties within the material introduces a whole slew of complexities and potential inconsistencies, many of which @debuggrl outlined above. I suspect it would result in fewer existing viewers implementing the extension. Explicitly duplicating materials in the glTF file will be more verbose, but after gzip/zstd it should not add much to de-facto file size over network.
Real-World Use Cases
This would be really useful. I also think an exploratory implementation in one of the WebGL viewers would be ideal (though obviously a bit of work). It's possibly something I could take on, in addition to the in-house work experiments @debuggrl is working on.
And so in a gltf viewer, when you click the midsole, it would look up the variants associated to that mesh, and give you the 3 options available to it (nike_midsole_white, nike_midsole_blue, nike_midsole_red)?
Yes. I imagine an implementation would gather up FB_asset_variants information while the glTF is being parsed, and stash that away in some new data structures / object properties. We could easily build a list of all available tags in that phase, and for performance we could build some kind of hashmap of tag -> set of sub-objects that need to be updated.
In a first implementation, it's probably OK to re-traverse the scene graph whenever tags switch. (I'm more worried that some engines will be limited in the amount of runtime redrawing they're equipped to handle once the loading process is over. In e.g. three.js though, that part should be close to trivial, if memory serves.)
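The parse-time bookkeeping described here could look roughly like the following Python sketch. It assumes the hypothetical FB_asset_variants list on mesh primitives from earlier in the thread; build_tag_index and the tuple layout are illustrative choices, not an existing loader API.

```python
from collections import defaultdict


def build_tag_index(gltf):
    """Map each tag to the primitives it affects.

    Each entry is (mesh index, primitive index, material index), so a tag
    switch only has to touch the listed primitives instead of re-traversing
    the whole scene graph.
    """
    index = defaultdict(list)
    for m, mesh in enumerate(gltf.get("meshes", [])):
        for p, prim in enumerate(mesh.get("primitives", [])):
            variants = prim.get("extensions", {}).get("FB_asset_variants", [])
            for variant in variants:
                index[variant["tag"]].append((m, p, variant["material"]))
    return dict(index)


gltf = {
    "meshes": [
        {
            "primitives": [
                {
                    "material": 3,
                    "extensions": {
                        "FB_asset_variants": [
                            {"tag": "nike_midsole_white", "material": 3},
                            {"tag": "nike_midsole_blue", "material": 4},
                        ]
                    },
                }
            ]
        }
    ]
}

index = build_tag_index(gltf)
print(sorted(index))                 # all available tags
print(index["nike_midsole_blue"])    # [(0, 0, 4)]
```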
Non-Material Changes
There are some new complexities here. At GDC I had good conversations about texture scale. Basically, we can pretty much never "stretch" a texture –– it simply doesn't look real. A sneaker size 11 will almost certainly need a different UV map than a sneaker size 10.
One pragmatic solution would be this:
"FB_asset_variants": [
{
"tag": "nike_midsole_white",
"material": 3,
"TEXCOORD_0": 13
},
{
"tag": "nike_midsole_blue",
"material": 4,
"TEXCOORD_0": 14
},
{
"tag": "nike_midsole_red",
"material": 5,
"TEXCOORD_0": 15
}
]
in which case engines just have to know that TEXCOORD_0 refers to the per-vertex attributes, and we resist the urge to generalise this further (although there's a reasonable case for COLOR_0 too).
However, there is another option. We could move the whole switching apparatus to the node level. It would look something like:
"extensionsUsed": [
"FB_asset_variants"
],
"nodes": [
{
"mesh": 0,
"extensions": {
"FB_asset_variants": {
"meshes": [
{
"tag": "nike_midsole_white",
"mesh": 0
},
{
"tag": "nike_midsole_blue",
"mesh": 1
},
{
"tag": "nike_midsole_red",
"mesh": 2
}
]
}
}
}
],
"meshes": [
{
"name": "nike_midsole_white_mesh",
"primitives": [
{
"attributes": {
"NORMAL": 1,
"POSITION": 2,
"TEXCOORD_0": 13
},
"indices": 0,
"mode": 4,
"material": 3
}
]
},
{
"name": "nike_midsole_blue_mesh",
"primitives": [
{
"attributes": {
"NORMAL": 1,
"POSITION": 2,
"TEXCOORD_0": 14
},
"indices": 0,
"mode": 4,
"material": 4
}
]
},
{
"name": "nike_midsole_red_mesh",
"primitives": [
{
"attributes": {
"NORMAL": 1,
"POSITION": 2,
"TEXCOORD_0": 15
},
"indices": 0,
"mode": 4,
"material": 5
}
]
}
],
So this adds more explicit duplication within the JSON itself, but is also helped by the fact that existing engines (at least ones that implement references to shared data properly, without copying it) are very good at these simple index-based references from one object to another.
I've been thinking a lot about this one by Sketchfab: https://demos.sketchfab.com/clients/nike-configurator/index.html.
This is indeed very interesting and instructive. There is a (less impressive but) simpler variant of Sketchfab's configurator here.
The entire configurator is on the order of a hundred lines of code. It's extremely simple & the source is right here on GitHub. On startup, it identifies the nodes that should be switchable. At runtime, it simply turns nodes on and off depending on the selected option.
So this example does a form of switching at the node level, like my last example.
I have to think through the node-level switching a bit more. I think the concern I have is that for many items - handbags, furniture, world-objects - that are not clothing or shoes, when we change a material, we aren't changing the mesh, and when we change the "size", it should actually be an entirely different model (single seat chair vs. loveseat should be a different model for example. Clutch should be different model than crossbody purse, even for the same "stylename"). So in these cases, if we just change the material, I want to make sure that we don't incur any additional cost of reloading vertices into a vertex buffer. Maybe we can do that with node-level? I'm not sure though. Seems like it would be hard to guarantee that the vertices have not changed. I haven't spent too much time on it though. Might be time to write a viewer implementation to try it out.
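One way an engine could check the "vertices have not changed" condition with node-level switching is to compare accessor references rather than vertex data. A minimal Python sketch under that assumption; shares_geometry is a hypothetical helper, and it deliberately ignores morph targets.

```python
def geometry_key(primitive):
    """Identify a primitive's geometry by the accessors it references."""
    return (
        tuple(sorted(primitive.get("attributes", {}).items())),
        primitive.get("indices"),
        primitive.get("mode", 4),  # glTF default mode is 4 (TRIANGLES)
    )


def shares_geometry(mesh_a, mesh_b):
    """True when two meshes reference identical accessors, primitive by primitive.

    In that case switching between them is only a material change and no
    vertex buffer should need to be re-uploaded.
    """
    prims_a, prims_b = mesh_a["primitives"], mesh_b["primitives"]
    if len(prims_a) != len(prims_b):
        return False
    return all(geometry_key(a) == geometry_key(b) for a, b in zip(prims_a, prims_b))


white = {"primitives": [{"attributes": {"NORMAL": 1, "POSITION": 2, "TEXCOORD_0": 13},
                         "indices": 0, "mode": 4, "material": 3}]}
blue = {"primitives": [{"attributes": {"NORMAL": 1, "POSITION": 2, "TEXCOORD_0": 14},
                        "indices": 0, "mode": 4, "material": 4}]}
recolor = {"primitives": [{"attributes": {"NORMAL": 1, "POSITION": 2, "TEXCOORD_0": 13},
                           "indices": 0, "mode": 4, "material": 4}]}

print(shares_geometry(white, blue))     # False: TEXCOORD_0 differs
print(shares_geometry(white, recolor))  # True: only the material differs
```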
I think the concern I have is that for many items - handbags, furniture, world-objects - that are not clothing or shoes, when we change a material, we aren't changing the mesh, and when we change the "size", it should actually be an entirely different model
Yeah. There may be virtue in keeping this FB namespace extension very simple, e.g. you can switch out the material of a mesh primitive, and that's it.
Everyone: do we have a sense how good support is in practice for shared vertex data, among existing glTF loaders & associated engines? Specifically, if I instantiate two different mesh primitives that both reference the same vertex attribute accessor, how many engines actually copy the data, when they should just share a reference to it?
Yeah. There may be virtue in keeping this FB namespace extension very simple, e.g. you can switch out the material of a mesh primitive, and that's it.
That would definitely accommodate 90%+ of use cases while keeping the implementation lean and making it more likely for viewers to implement it. Supporting what @debuggrl said, most products on Shopify stores that have variants are either simple material changes or completely different models. There are not as many complex configurator products.
I think, in the case of clothing/product sizes, you'd want to change the rest of the mesh in addition to the UVs. Different shoe sizes have different proportions and details like the eyelets, laces, etc. don't scale with the rest of the shoe.
Would it really be too complicated to allow both mesh variants and material variants? I was also imagining a system where you could have multiple tags per option. In my car configurator example below, the Base and Sport packages share the same body mesh but have different spoilers.
"extensionsUsed": [
"FB_asset_variants"
],
"nodes": [
{
"name": "car_body",
"mesh": 0,
"extensions": {
"FB_asset_variants": {
"meshes": [
{
"tag": [
"base_package",
"sport_package"
],
"mesh": 0
},
{
"tag": [
"premium_package"
],
"mesh": 1
}
]
}
}
},
{
"name": "spoiler",
"mesh": 2,
"extensions": {
"FB_asset_variants": {
"meshes": [
{
"tag": ["base_package"],
"mesh": 2
},
{
"tag": [
"sport_package"
],
"mesh": 3
},
{
"tag": [
"premium_package"
],
"mesh": 4
}
]
}
}
}
],
"meshes": [
{
"name": "base_body",
"primitives": [
{
"attributes": {
"NORMAL": 1,
"POSITION": 2,
"TEXCOORD_0": 13
},
"indices": 0,
"mode": 4,
"material": 3,
"FB_asset_variants": {
"materials": [
{
"tag": [
"metallic_red"
],
"material": 0
},
{
"tag": [
"gloss_blue"
],
"material": 1
}
]
}
}
]
},
{
"name": "premium_body",
"primitives": [
{
"attributes": {
"NORMAL": 3,
"POSITION": 4,
"TEXCOORD_0": 10
},
"indices": 5,
"mode": 4,
"material": 4,
"FB_asset_variants": {
"materials": [
{
"tag": ["metallic_red"],
"material": 0
},
{
"tag": ["gloss_blue"],
"material": 1
}
]
}
}
]
},
{
"name": "base_spoiler",
"primitives": [
{
"attributes": {
"NORMAL": 1,
"POSITION": 2,
"TEXCOORD_0": 14
},
"indices": 0,
"mode": 4,
"material": 4,
"FB_asset_variants": {
"materials": [
{
"tag": ["metallic_red"],
"material": 0
},
{
"tag": ["gloss_blue"],
"material": 1
}
]
}
}
]
}
],
Yeah. That example is good and valid. I was thinking by the time you were replacing meshes, you’ve lost most of the benefit of reuse at the cost of a larger single asset.
But car body with alterable spoilers is an excellent counter to that.
Good call on making the tag property array-valued. I think the algorithm is then “activate the first variant selection whose tag array intersects non-empty with the set of user-selected tags” with fallback to the non-extended value.
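The selection rule stated above is short enough to sketch directly. This Python version assumes array-valued "tag" entries as in the car example; select_variant and its parameters are hypothetical names.

```python
def select_variant(variants, selected_tags, key, fallback):
    """Return variant[key] for the first variant whose tag array intersects
    the user-selected tags, or the non-extended fallback value otherwise."""
    selected = set(selected_tags)
    for variant in variants:
        if selected.intersection(variant["tag"]):
            return variant[key]
    return fallback


spoiler_node = {
    "name": "spoiler",
    "mesh": 2,
    "extensions": {
        "FB_asset_variants": {
            "meshes": [
                {"tag": ["base_package"], "mesh": 2},
                {"tag": ["sport_package"], "mesh": 3},
                {"tag": ["premium_package"], "mesh": 4},
            ]
        }
    },
}

variants = spoiler_node["extensions"]["FB_asset_variants"]["meshes"]
print(select_variant(variants, {"sport_package", "gloss_blue"}, "mesh", spoiler_node["mesh"]))  # 3
print(select_variant(variants, {"gloss_blue"}, "mesh", spoiler_node["mesh"]))                   # 2
```

Note that "first match wins" makes the order of the variants array significant when several selected tags match different entries.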
I think we’ve identified the two places we could support switching — the material reference and the mesh one.
(I suppose scene and camera could be another — though they already have APIs of sorts.)
I think we might want to survey a couple of existing glTF loaders and ask/find out how well they are able to runtime-switch the materials on a mesh, and how well they can switch the mesh of a node.
It'll be interesting to see where people draw the line for including all variants in a single file, vs. splitting them out. The spoiler example is a strong case for single file, but then if you have a scenario where you only have 2 different meshes with their own unique materials, do you not use the extension?
I guess for .glTF it wouldn't really matter since all the resources can be external, but for .glb these files could get unnecessarily large.
Would GLBs contain the "default configuration", with everything else being external? Or does that break what GLBs are meant for? Seems to be some discussion on that here: #1117, which @zellski is a part of.
Hmm, yes. The tradeoff equations do look different in .glb and .gltf scenarios.
Hot take: Most base use cases are served well enough by GLBs, and by the time use cases get sufficiently advanced to warrant it, you can switch to .gltf & take on the extra complexity of async on-demand loading from external buffers. We can focus here on powerful & elegant expressive power, trusting in existing/other glTF mechanisms for bandwidth optimisation.
For example, if the two car bodies (meshes #0 and #1) in @MiiBond's example above are large, they would live in separate .bin files, the engine might pre-load one before declaring the model ready, and if it was feeling fancy it could start asynchronously pre-caching the other body mesh in the background, so things are zippier when the user starts playing around with their configurator.
We should also imagine that an advanced use case could be fed some higher-level configuration data that switches between implementing GLBs, along the lines of,
{
"variants": [{
"tag": [ "base_package", "sport_package"],
"asset": "https://car-company.com/assets/base_and_sport_variants.glb"
}, {
"tag": [ "premium_package"],
"asset": "https://car-company.com/assets/premium_variants.glb"
}]
}
Such assets wouldn't even need to be exclusive to one another; there's no reason you can't load multiple GLBs into a scene, each implementing the extension, each fed the same set of user-selected tags.
(Obviously the high-level spec is not one we need to write, it's just good to keep in mind there are many ways to handle the redundancy quandary.)
PS. I don't think GLBs with external references are very useful, even if technically legal –– to my mind, the .glb suffix should signal self-sufficiency...
100% agree. .glb being self-sufficient is a must have and should be non-negotiable.
A little input from the perspective of someone who works for a company with configurators as core business (Dassault Systemes 3DExcite) --
Depending on how far you may want to go down the rabbit hole, another important component in a variant system is the ability to define 'dependencies'. When it comes to automotive configurators the dependency component is critical to anything we build that goes beyond basic color/mesh switching. To give a basic example, a 'sport package' might enable one to choose from rims 'b,c', while a premium package might offer 'a,b,c,d', but only 'a,c,d' when front spoiler 'x' is selected as well.
I feel strongly that the extension should not know about complex rules or dependencies. Enforcing rules should be up to the application to make sure to only show selectable options for what is possible.
The model itself though has no knowledge of this. If you were to load it up in a default viewer like https://gltf-viewer.donmccurdy.com/, you would be able to view configurations that are not possible.
That would keep the complexity down immensely.
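To make the split concrete, here is a minimal Python sketch of what the application-side rule could look like for the rims example mentioned above. The rule set, the selection keys ("package", "front_spoiler") and the function name allowed_rims are all hypothetical; the point is only that this logic lives in the configurator, not in the glTF.

```python
def allowed_rims(selection):
    """Return the rims the UI should offer for the current selection.

    Mirrors the example: sport offers b and c; premium offers a, b, c, d,
    but only a, c, d when front spoiler x is also selected.
    """
    package = selection.get("package")
    if package == "sport":
        rims = {"b", "c"}
    elif package == "premium":
        rims = {"a", "b", "c", "d"}
    else:
        rims = set()
    if package == "premium" and selection.get("front_spoiler") == "x":
        rims -= {"b"}
    return rims


print(sorted(allowed_rims({"package": "premium", "front_spoiler": "x"})))  # ['a', 'c', 'd']
```

The application would then pass only tags for permitted choices to the viewer, while a generic viewer remains free to show any combination.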
After the call this morning, I was thinking about this a bit more. So far, we have been talking about mesh and material variants and, for retail, that's probably sufficient. However, many of Adobe's users are artists and are using our glTF export for design review. It occurs to me that allowing variants for lighting (punctual and IBL) and cameras is also needed. We also support custom extensions for things like static background plates that we'd probably also want to apply the variants extension to.
At least with lights and cameras, it would work exactly in the same way as in the example above. The extension would be on a node and would override light: or camera: instead of mesh:.
At least with lights and cameras, it would work exactly in the same way as in the example above. The extension would be on a node and would override light: or camera: instead of mesh:.
This would be very elegant indeed. If we extended support this far, I would propose a root-level switcher on "scene" as well, for when variants have significantly different node graphs, but still share most resources.
None of this feels like feature bloat yet. I think the only danger is that, as we increase the number of places where we can switch on tag, engine compliance becomes more onerous.
As for @ikaria's suggestion above, I agree with @pushmatrix that those dependencies mainly belong in whatever higher-level configuration app vendors write. I think there might be occasions where we wish we could switch on some simple boolean combinations of tags... like, to bastardise @MiiBond's example above:
"FB_asset_variants": {
"materials": [
{
"all-of-tags": [
"premium_package",
"metallic_red"
],
"but-not-tags": [
"edgy_aesthetics"
],
"material": 0
},
...
but it seems rather contrived, and besides tags are lightweight: one could just as easily create one for each meaningful permutation.
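For reference, evaluating such a boolean combination would be cheap. A minimal Python sketch, assuming the hypothetical "all-of-tags" and "but-not-tags" keys from the snippet above (matches is an illustrative name):

```python
def matches(variant, selected_tags):
    """True when every required tag is selected and no excluded tag is."""
    selected = set(selected_tags)
    required = set(variant.get("all-of-tags", []))
    excluded = set(variant.get("but-not-tags", []))
    return required <= selected and not (excluded & selected)


variant = {
    "all-of-tags": ["premium_package", "metallic_red"],
    "but-not-tags": ["edgy_aesthetics"],
    "material": 0,
}

print(matches(variant, {"premium_package", "metallic_red"}))                     # True
print(matches(variant, {"premium_package", "metallic_red", "edgy_aesthetics"}))  # False
print(matches(variant, {"premium_package"}))                                     # False
```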
I was looking at ease of implementation for these options in our engine. It is definitely much easier to implement the simple material switching. @zellski is correct in that the higher you go in allowing the node/scene switching, the more the engine has to track and account for, which will give greater complexity.
I think we may want to think about having multiple extensions. The simple material switching could work hand-in-hand with the mesh/light/node extension. Each one is getting us closer to the goal of highly efficient, highly flexible asset transmission.
I really like this thread. 🙌
Last night I decided to think about the option that I believe Wayfair had mentioned during our call. In this example there was a bedframe with legs. Options for the bedframe would be Queen and King. Bed leg transformations would need to change when the bedframe is changed from Queen to King.
Example:
{
"extensionsUsed": [
"SHOP_variants"
],
"nodes": [
{
"name": "bed",
"children": [1, 2]
},
{
"name": "bed_base",
"mesh": 0,
"extensions": {
"SHOP_variants": [
{
"tag": [
"queen"
],
"mesh": 0
},
{
"tag": [
"king"
],
"mesh": 1
}
]
}
},
{
"name": "bed_legs",
"children": [3, 4, 5, 6]
},
{
"name": "leg_front_left",
"mesh": 2
},
{
"name": "leg_front_right",
"mesh": 2,
"scale": [-1, 1, 1]
},
{
"name": "leg_back_left",
"mesh": 2,
"SHOP_variants": [
{
"tag": [
"king"
],
"translation": [0, 0, 0.3]
}
]
},
{
"name": "leg_back_right",
"mesh": 2,
"scale": [-1, 1, 1],
"SHOP_variants": [
{
"tag": [
"king"
],
"translation": [0, 0, 0.3]
}
]
}
]
}
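To make the intent of the example concrete, here is a rough Python sketch of how a loader might resolve one node for a set of active tags. The function names are hypothetical, and two behaviours are assumptions: an entry applies when any of its "tag" values is active, and matching entries simply overwrite the node's base properties.

```python
def variant_entries(node):
    """Return the node's SHOP_variants list, nested under "extensions" or not."""
    nested = node.get("extensions", {}).get("SHOP_variants")
    return nested if nested is not None else node.get("SHOP_variants", [])


def resolve_node(node, active_tags):
    """Return a shallow copy of the node with matching overrides applied."""
    active = set(active_tags)
    resolved = dict(node)
    resolved.pop("SHOP_variants", None)
    for entry in variant_entries(node):
        if set(entry["tag"]) & active:
            # Copy everything except the selector itself onto the node.
            resolved.update({k: v for k, v in entry.items() if k != "tag"})
    return resolved


bed_base = {
    "name": "bed_base",
    "mesh": 0,
    "extensions": {
        "SHOP_variants": [
            {"tag": ["queen"], "mesh": 0},
            {"tag": ["king"], "mesh": 1},
        ]
    },
}
leg_back_left = {
    "name": "leg_back_left",
    "mesh": 2,
    "SHOP_variants": [{"tag": ["king"], "translation": [0, 0, 0.3]}],
}

king_base = resolve_node(bed_base, ["king"])
king_leg = resolve_node(leg_back_left, ["king"])
queen_leg = resolve_node(leg_back_left, ["queen"])
```

With "king" active the base switches to mesh 1 and the back legs pick up the translation; with "queen" active the legs keep their default transform.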
From an architectural standpoint this all works, and there should be no reason why you cannot use matrix. However, I did read over the spec to ensure there were no conflicts and saw this:
When a node is targeted for animation (referenced by an animation.channel.target), only TRS properties may be present; matrix will not be present.
So in the above example, if matrix is used to define transformations, everything is fine until an animation is applied to a node, in which case matrix would need to be decomposed into TRS properties. Not nice, but it works.
So:
{
"name": "leg_back_right",
"mesh": 2,
"matrix": [-1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
"SHOP_variants": [
{
"tag": [
"king"
],
"matrix": [-1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0.3, 1]
}
]
}
Becomes:
{
"name": "leg_back_right",
"mesh": 2,
"scale": [-1, 1, 1],
"SHOP_variants": [
{
"tag": [
"king"
],
"translation": [0, 0, 0.3]
}
]
}
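As a quick sanity check that the matrix and TRS forms above describe the same transforms, here is a small Python sketch that composes a glTF-style column-major matrix from translation and scale. It assumes the rotation is identity (true for this example, but not in general), and the helper name is hypothetical.

```python
def compose_matrix(translation=(0, 0, 0), scale=(1, 1, 1)):
    """Column-major 4x4 matrix for T * S, with rotation assumed to be identity."""
    tx, ty, tz = translation
    sx, sy, sz = scale
    return [
        sx, 0, 0, 0,    # first column
        0, sy, 0, 0,    # second column
        0, 0, sz, 0,    # third column
        tx, ty, tz, 1,  # fourth column holds the translation
    ]


base_matrix = compose_matrix(scale=(-1, 1, 1))
king_matrix = compose_matrix(translation=(0, 0, 0.3), scale=(-1, 1, 1))
```

Going the other way (matrix to TRS) is the awkward direction once rotation is involved, which is the decomposition cost mentioned above.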
But it does get me thinking about how animations affect all of this. For instance, what if an animation is applied to a variant and not the base node?
So @pjcozzi brought this subject up today on the glTF call, sort of asking for next steps.
My feeling is that the conversation has moved beyond an FB extension at this point. We will write one up for internal use, and document it here as a PR, but beyond that it's served its public purpose.
What we're really talking about now is what an EXT_ extension would look like, and that's obviously a different can of worms. Luckily this group of people have a lot of experience in how to design for maximum use case coverage, while keeping functionality clean & orthogonal, and drawing a line in the sand when the danger of feature bloat rears its head.
Some of this conversation should probably be had in conjunction with the exploratory 3D Commerce group, since so many of the more challenging & exciting use cases come from the retail domain.
It's probably too early to start assembling a PR for an EXT_ extension, but it would likely be helpful to summarise what semi-consensus has emerged here, and see whether there is appetite for further iteration from there.
Thoughts?
(@mikkoh –– your comment is interesting, and deserves a reply, just needed to get this out separately.)
From my point of view, variants should behave like "overrides".
This means, any variant can explicitly change a scalar, color value, texture, material, primitive, mesh or whatever a specific variant requires.
Also, it should not be attached directly to a model. I would suggest that the variant targets the model via an index. The advantage is that the main glTF sections stay "clean".
@UX3D-nopper could you post a code example of what you mean re "variant is targeting the model via an index"?
Will prepare it tomorrow.
{
"products": [
{
"name": "My product A",
"variants": [
{
"tag": "My variant 0",
"meta": "My variant 0 meta information."
},
{
"tag": "My variant 1",
"meta": "My variant 1 meta information."
},
{
"tag": "My variant 2",
"meta": "My variant 2 meta information."
}
]
}
],
"materials": [
{
"pbrMetallicRoughness": {
},
"name": "My product material"
},
{
"pbrMetallicRoughness": {
"baseColorFactor": [ 1, 0, 0, 1 ]
},
"name": "My product alternative material",
"tag": "My variant 0",
"override": 0
},
{
"pbrMetallicRoughness": {
"baseColorFactor": [ 1, 1, 0, 1 ]
},
"name": "My product alternative material",
"tag": "My variant 1",
"override": 0
}
],
"meshes": [
{
"primitives": [
{
"attributes": {
"TEXCOORD_0": 0,
"NORMAL": 1,
"TANGENT": 2,
"POSITION": 3
},
"indices": 4,
"material": 0
}
],
"name": "My product mesh"
},
{
"primitives": [
{
"attributes": {
"TEXCOORD_0": 5,
"NORMAL": 6,
"TANGENT": 7,
"POSITION": 8
},
"indices": 9,
"material": 0
}
],
"name": "My product alternative mesh",
"tag": "My variant 1",
"override": 0
}
],
"nodes": [
{
"mesh": 0,
"name": "My product node"
},
{
"mesh": 0,
"translation": [0.0, 0.0, 1.0],
"name": "My product alternative node",
"tag": "My variant 2",
"override": 0
}
],
"scene": 0,
"scenes": [
{
"nodes": [
0
]
}
]
}
I have left out the extensions etc. properties so that this can be read more easily:
products
At root level, all the products and their variants are listed. This is also the place to store any meta information about the 3D model, which can be huge. The important part is the tag property. So, this product has a default material plus three variants. If wanted, the default material can be encoded as a variant as well.
Then we have meshes, materials and nodes. The additional properties are tag and override: tag is the name of the variant and override is the index of the element in the array to change. So an override in meshes changes the values in meshes, an override in materials changes the values in materials, and so on.
The advantage of this approach is that if the extension is not supported, the default scene can still be rendered. Please note that even though multiple nodes, meshes etc. are defined, only node 0 and its mesh are rendered.
If the extension is supported, the renderer can toggle between the variants. If, for example, variant 1 is active, all properties and values tagged with variant 1 are changed.
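A minimal Python sketch of that toggle, under the assumption that a tagged element replaces the element at its override index wholesale (the description above would also allow a property-by-property merge). The function name is hypothetical.

```python
import copy


def apply_variant(gltf, active_tag):
    """Return a deep copy of the asset with overrides for active_tag applied."""
    result = copy.deepcopy(gltf)
    for section in ("materials", "meshes", "nodes"):
        targets = result.get(section, [])
        for item in gltf.get(section, []):
            if item.get("tag") == active_tag and "override" in item:
                # Assumption: the tagged element replaces its target entirely.
                replacement = {
                    k: v for k, v in item.items() if k not in ("tag", "override")
                }
                targets[item["override"]] = copy.deepcopy(replacement)
    return result


asset = {
    "materials": [
        {"name": "My product material", "pbrMetallicRoughness": {}},
        {
            "name": "My product alternative material",
            "pbrMetallicRoughness": {"baseColorFactor": [1, 0, 0, 1]},
            "tag": "My variant 0",
            "override": 0,
        },
        {
            "name": "My product alternative material",
            "pbrMetallicRoughness": {"baseColorFactor": [1, 1, 0, 1]},
            "tag": "My variant 1",
            "override": 0,
        },
    ],
}

red = apply_variant(asset, "My variant 0")
```

The original asset is left untouched, so switching to another variant is just another call against the same source data.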
One idea we may want to consider is that there may be a LOT of variants, in some cases enough to warrant splitting the binary data per variant or per group of variants. The non-GLB form of glTF allows multiple buffers; perhaps clients implementing this extension should be aware that they need to identify which variant they're using, and only request the buffer(s) that are applicable to that variant.
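As a sketch of what "only request the applicable buffers" could mean in practice, the Python below walks the standard glTF chain from mesh primitives through accessors and bufferViews to buffers. The function name is hypothetical, and the sketch deliberately ignores morph targets, sparse accessors and image data.

```python
def buffers_for_meshes(gltf, mesh_indices):
    """Return the set of buffer indices referenced by the given meshes."""
    accessors = gltf.get("accessors", [])
    buffer_views = gltf.get("bufferViews", [])
    needed = set()
    for mesh_index in mesh_indices:
        for primitive in gltf["meshes"][mesh_index]["primitives"]:
            accessor_ids = list(primitive.get("attributes", {}).values())
            if "indices" in primitive:
                accessor_ids.append(primitive["indices"])
            for accessor_id in accessor_ids:
                view_id = accessors[accessor_id].get("bufferView")
                if view_id is not None:
                    needed.add(buffer_views[view_id]["buffer"])
    return needed


asset = {
    "buffers": [{"uri": "shared.bin"}, {"uri": "variant_b.bin"}],
    "bufferViews": [{"buffer": 0}, {"buffer": 1}],
    "accessors": [
        {"bufferView": 0},
        {"bufferView": 0},
        {"bufferView": 1},
        {"bufferView": 1},
    ],
    "meshes": [
        {"primitives": [{"attributes": {"POSITION": 0}, "indices": 1}]},
        {"primitives": [{"attributes": {"POSITION": 2}, "indices": 3}]},
    ],
}

default_buffers = buffers_for_meshes(asset, [0])
variant_buffers = buffers_for_meshes(asset, [1])
```

A client would download only the buffers in the returned set for the variant it is about to show, and fetch the rest on demand.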
Some months later, we're about ready to start employing a simple, material-switching vendor extension (current draft) internally.
Ours is very simple –– mesh primitives can switch materials, that's it. What might go in a multi-vendor extension? (It still seems quite important to me that one materialise, lest we balkanise the space with vendor extensions).
There are excellent ideas in this thread; some minor, some major. Perhaps a summary might be useful, to refocus conversation. Have opinions shifted? Those of you who are involved in commerce endeavours, have presumptions or conclusions shifted?
CC: @debuggrl
@zellski I like the general idea you are proposing, but I'm wondering how it works in practice. Let's say I'm sending you (my customer) a shoe, and I want you to see the red variant. How would that be communicated to the end user? I might send you a URL, so I could imagine the tags being included in the URL. But what if I send you the file? How do I select the red variant so that you see that when you load it? Is there a default tag in the file also? As an extended example, red and yellow may be valid options in your region, but blue may not be available (regional differences etc.). Does the mechanism provide a means for describing option sets?
@steveghee In my opinion, all such high-level considerations are the domain of the actual hosting application itself. Different use cases, or even different (e.g.) retailers, are all going to have wildly different environments for product identifiers, i18n descriptions, associated flat image links, prices, URL mappings -- a million things we would never want to include here.
Instead, a glTF loader that accepts this extension would just export the core functionality of low-latency runtime tag-based switching, along with some form of API to allow e.g. setEnabledTags() or enableTag()/disableTag() and so forth. The extension should be as lean and as "orthogonisable" as possible.
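To illustrate how small that API surface could be, here is a Python sketch of a tag switcher. The class name and the on_change callback are hypothetical; the three methods mirror setEnabledTags(), enableTag() and disableTag() in Python naming, and the callback stands in for whatever the engine does to swap materials or nodes.

```python
class VariantSwitcher:
    """Tracks the enabled tags and notifies the engine only on real changes."""

    def __init__(self, on_change=None):
        self._enabled = set()
        self._on_change = on_change

    @property
    def enabled_tags(self):
        return frozenset(self._enabled)

    def set_enabled_tags(self, tags):
        self._update(set(tags))

    def enable_tag(self, tag):
        self._update(self._enabled | {tag})

    def disable_tag(self, tag):
        self._update(self._enabled - {tag})

    def _update(self, new_tags):
        # Skip redundant updates so the engine does no needless work.
        if new_tags != self._enabled:
            self._enabled = new_tags
            if self._on_change is not None:
                self._on_change(frozenset(new_tags))


changes = []
switcher = VariantSwitcher(on_change=changes.append)
switcher.set_enabled_tags(["red"])
switcher.enable_tag("trail")
switcher.enable_tag("trail")  # already enabled, so no notification
switcher.disable_tag("red")
```

Everything about product identifiers, availability or defaults stays in the hosting application, which simply calls these methods.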
It's possible it could be meaningful to include a simple root-level object that enumerates the tags actually used deeper in the JSON hierarchy, perhaps associated with a "name" field that could be more descriptive, but mostly so that general purpose loaders can easily present the user with a "debug" level UI.
I wonder if this is, in fact, as @debuggrl suggested above, an occasion where multiple extensions make sense - related ones, but distinct. Probably two of them.
The danger with overloading a single extension is obvious: it becomes rapidly more difficult for an engine to implement the extension well. If we require fluid interactivity (no 100+ ms freezes) and intelligent memory handling (data that is shared in the glTF asset should be shared data in memory), and we additionally require tagged material switching at the mesh primitive level and tagged scene graph manipulation at the node level, those combined requirements may well exceed what engines today can realistically accomplish without major refactoring.
If we separate the extensions, what's currently FB_material_variants could probably graduate to EXT largely as-is. Then we could additionally consider an EXT_node_variants extension that allows lights / cameras / meshes, i.e. anything that hangs off of a node, to be real-time mutable on a per-tag basis.
Most use cases we're considering today would require only the first extension. This includes those use cases where geometry does change, but it changes so substantially that we're better off using separate assets per geometry variant.
For use cases such as @MiiBond's car spoiler above, where only a small detail changes geometry, the assets would flag themselves as requiring both extensions, and so whoever decides to go that route would first ensure there's an engine for them to use that implements both.
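The "flag themselves as requiring both" part maps onto glTF's existing top-level extensionsRequired array. A minimal Python sketch of the check an engine or pipeline could run first; the two EXT_ names are hypothetical placeholders for extensions that do not exist yet, and the function name is likewise made up:

```python
def missing_required_extensions(gltf, supported):
    """Return the required extensions that the engine does not implement."""
    required = set(gltf.get("extensionsRequired", []))
    return sorted(required - set(supported))


# Hypothetical asset that needs both material and node switching.
asset = {
    "extensionsUsed": ["EXT_material_variants", "EXT_node_variants"],
    "extensionsRequired": ["EXT_material_variants", "EXT_node_variants"],
}

missing = missing_required_extensions(asset, {"EXT_material_variants"})
```

An engine that implements only the material extension would see a non-empty result here and refuse the asset, while assets that need only material switching would load everywhere.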
Finally, while I tend to think mostly in GLB terms, @emackey is of course right that we should take into account glTF's ability to selectively download separate buffers and textures. I think the extensions as outlined here do work well from that point of view. An application that knows it can split variant data out into separate buffers (and is okay with the latency involved in downloading them on demand) would just make different choices about how to use the extensions.
One more development here: we’ve been writing a tool (in Rust) that takes a set of glTF assets, all of which are structurally identical and vary only in materials used, and outputs a single new GLB that contains all the variations compactly, reusing any shared data.
It’s still in a pre-prerelease state, but in the next few weeks we will make a first alpha publicly available on GitHub, under MIT license.
At first it’ll be a barebones CLI tool, but it compiles to WebAssembly, and a web app version will run entirely in the browser.
We export using an FB_material_variants extension (which we will PR here). I’m hoping we can get to an EXT or even KHR that’s sufficiently in scope for our project that we can switch to that later on.
@zellski I would like you to check this proposal: #1660
If I understand correctly, part of the problem of handling asset variations is that the scene hierarchy holds what needs to be rendered, and that makes things complicated for variations.
My proposal at #1660 decouples scene hierarchy, and presents a flat list of "things to render" for every scene, which would potentially simplify reusing resources between different scenes in a glTF.
@vpenades I can't really speak to the virtue of that proposal, but it looks too heavyweight to me to be relevant for this discussion at this point in time. I think we need simple first steps that are easy for clients to opt into and which cover the most realistic early use cases. This field, commerce specifically, is going to move very quickly, and we need to be light on our feet.
This was completed and released:
https://www.khronos.org/blog/streamlining-3d-commerce-with-material-variant-support-in-gltf-assets