Generalized msdf generation with (stochastic?) optimization.
I have been using msdfgen a bit, on a variety of different kinds of paths and shapes.
In general I am satisfied with its performance, but the analytic approach to producing MSDFs seems to run into five-PhD edge cases, and its performance degrades rapidly as you add segments to your path.
I think that an optimization-based approach could solve all of these problems at once, and could run primarily on the GPU if desired.
The general idea in my head is:
1. Generate a baseline MSDF (e.g. all points outside the shape, all points inside the shape, white noise, or copy a conventional SDF).
2. Generate initial candidate permutations on the SDF samples.
3. Compare the reference samples in the area of effect of the permuted SDF sample, to the result of the standard shading operation at that point.
   - Use the prior values for surrounding SDF samples, only testing the current permutation.
   - When calculating the shaded result, use the same linear interpolation between SDF samples, just as it will be with the final texture.
   - The reference can be a small tile or subset of the whole shape, since the impact of a given SDF sample is limited to a radius of px_range. This means the whole reference image doesn't necessarily need to be in memory at full resolution, nor does the whole output need to be optimized at the same time.
4. Select the winning permutation based on an error metric, optionally persist a subset of the runners-up from the current round into the next.
5. Integrate selected permutations.
6. Exit if error metric is satisfactory.
7. Generate new candidate permutations.
8. GOTO 3
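To make the loop above concrete, here is a rough sketch in Python. It is not an implementation from msdfgen, and it simplifies heavily: the field is 1D, the error metric is a plain sum of squared differences, candidates are Gaussian perturbations, and the persistence of runners-up is left out. The baseline puts every sample at the edge value 0.5, an assumption chosen so that small perturbations change the shaded result right away. All function and parameter names are hypothetical.

```python
import random

def median3(r, g, b):
    return max(min(r, g), min(max(r, g), b))

def shade(field, x, screen_px_range=8.0):
    """Coverage at continuous texel position x, with linear interpolation
    between neighbouring samples, as the texture sampler would do."""
    i = min(int(x), len(field) - 2)
    t = x - i
    r, g, b = ((1 - t) * a + t * c for a, c in zip(field[i], field[i + 1]))
    sd = median3(r, g, b)
    return min(max(screen_px_range * (sd - 0.5) + 0.5, 0.0), 1.0)

def local_error(field, reference, i, oversample):
    """Squared error over the reference pixels that sample i can influence
    (one texel to either side, because of the linear interpolation)."""
    lo = max(0, (i - 1) * oversample)
    hi = min(len(reference), (i + 1) * oversample + 1)
    return sum((shade(field, p / oversample) - reference[p]) ** 2
               for p in range(lo, hi))

def total_error(field, reference, oversample):
    return sum((shade(field, p / oversample) - reference[p]) ** 2
               for p in range(len(reference)))

def optimize(reference, n_samples, oversample, rounds=50, candidates=8, seed=0):
    rng = random.Random(seed)
    # Assumed baseline: every sample sits exactly on the edge value.
    field = [[0.5, 0.5, 0.5] for _ in range(n_samples)]
    for _ in range(rounds):
        for i in range(n_samples):
            best, best_err = field[i], local_error(field, reference, i, oversample)
            for _ in range(candidates):
                # Candidate: perturb the best sample so far, clamped to [0, 1].
                cand = [min(max(c + rng.gauss(0.0, 0.1), 0.0), 1.0) for c in best]
                # Test only this candidate; neighbours keep their prior values.
                saved, field[i] = field[i], cand
                err = local_error(field, reference, i, oversample)
                field[i] = saved
                if err < best_err:
                    best, best_err = cand, err
            field[i] = best  # integrate the winner for this sample
    return field

# Reference: 65 pixels (8 per texel over 9 samples), shape covers pixels 20..44.
reference = [1.0 if 20 <= p <= 44 else 0.0 for p in range(65)]
field = optimize(reference, n_samples=9, oversample=8)
print("error after optimization:", total_error(field, reference, 8))
```

Because a candidate is only accepted when it lowers the error over every pixel that sample can touch, the total error never increases. The same locality is what would let each tile be optimized independently on the GPU.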
An additional benefit of this approach is that it works on arbitrary input images, and can produce soft regions instead of hard aliasing when the SDF resolution is too low. This is particularly helpful for shapes with many small holes, where the analytical approach is inefficient and error-prone.
I will go ahead and try my hand at this problem; I'm putting up this issue to track ideas and conversations about this, and gather ideas in case anyone else here has thought about this.
For some concrete examples of the problems I'm looking to solve, consider this example shape, which I am loading as an SVG (forgive the PDF; GitHub doesn't let me attach an SVG here):
viktor.pdf
I generate an MTSDF with this command:
msdfgen mtsdf -svg ~/rnd/viktor.svg -size 512 512 -pxrange 37 -autoframe
That results in this:
That generation takes 6:15 on a desktop Zen+ processor.
This MSDF produces errors immediately, even when rendering at the scale of the texture, which can be seen more clearly at magnification.
For example:
- Islands appear between sharp corners on opposing terminals.
- Complex arrangements of "spokes" become irregular and wavy instead of just blurry.
While resolving these issues at the same output resolution will not make the rendering perfect, there is probably a way to make the artifacts less destructive.
Good luck with your research. By 6:15, I hope you mean 6 seconds and not 6 minutes, but yeah, the program is not optimized for many isolated shapes like this; more typically, a distance field would be generated for each glyph separately.
Thin lines and narrow gaps are a big issue for discretely sampled distance fields in general, but using the new option -coloringstrategy distance actually fixes many of these errors, although in this case it takes a very long time, because shortest distances between all pairs of edges are computed. Still, the case of many edges meeting close to each other as within the asterisks is not solved by a different edge coloring, at least not without adding more color channels.
> By 6:15, I hope you mean 6 seconds and not 6 minutes
🙊
> Thin lines and narrow gaps are a big issue for discretely sampled distance fields in general
Yeah, there is no free lunch, and I'm not trying to discover a proof against the Whittaker–Nyquist–Shannon theorem here.
> but using the new option -coloringstrategy distance actually fixes many of these errors, although in this case it takes a very long time, because shortest distances between all pairs of edges are computed. Still, the case of many edges meeting close to each other as within the asterisks is not solved by a different edge coloring, at least not without adding more color channels.
I did also look at more channels. That's tough, since selecting one of three channels is easier than selecting one of four, though maybe the same operations work fine on four channels (that would completely resolve these four-corner conflicts, if it worked).
I was thinking more along the lines of 5 or 7 channels, where the same algorithm (taking the middle value) can be applied. This has the obvious downside of not being representable by a single texel, so a special mapping and a shader that takes two samples would have to be used, but I think there could be some very specialized use cases where this would actually be a good solution.
Five channels seems pretty reasonable overall; sampling at least two three- or four-component textures per fragment is the norm in most GPU work anyway, and texture samplers are more or less built for this. The question is whether this wins over just doubling the number of texels and sticking to three channels. I guess it would depend on the shape.
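For reference, "taking the middle value" extends to any odd channel count. The snippet below is only an illustration in Python, not shader code: `median_odd` is a hypothetical helper, and the five values stand in for two texture samples (say, three channels from one texel and two from another).

```python
def median3(r, g, b):
    # Same formula as the three-channel shader median.
    return max(min(r, g), min(max(r, g), b))

def median_odd(channels):
    """Middle value of an odd number of channel samples."""
    if len(channels) % 2 == 0:
        raise ValueError("median selection needs an odd number of channels")
    return sorted(channels)[len(channels) // 2]

# Three channels: agrees with median3.
print(median_odd([0.2, 0.9, 0.5]), median3(0.2, 0.9, 0.5))

# Five channels, e.g. (r, g, b) from one sample and (r, g) from a second.
print(median_odd([0.1, 0.8, 0.4] + [0.6, 0.3]))
```

In a real shader the sort would presumably be replaced by a fixed min/max network, which is why odd counts such as 5 or 7 are the natural candidates.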
So, before working on the new generation method, I wanted to see if I could tweak the shader to resolve some of the artifacts. Unsurprisingly, clamping the rgb median to be near the true SDF sample (on MTSDF) causes only a minor change to the rendering of extreme acute and reflex angles, and removes most of those unsightly little voids in the fills, and those islands in the whitespaces.
Here's my baseline fragment shader (in WGSL)
// VertexOutput is declared with the vertex stage; only its tex_coords: vec2<f32> field is used here.
[[group(0), binding(0)]]
var t_mtsdf: texture_2d<f32>;
[[group(0), binding(1)]]
var s_mtsdf: sampler;

// Median of the three color channels: the MSDF distance estimate.
fn median3(r: f32, g: f32, b: f32) -> f32 {
    return max(min(r, g), min(max(r, g), b));
}

[[stage(fragment)]]
fn fs_main(in: VertexOutput) -> [[location(0)]] vec4<f32> {
    var px_range: f32 = 37.0; // must match -pxrange used at generation time
    var fg_color: vec4<f32> = vec4<f32>(0.0, 0.0, 0.0, 1.0);
    var bg_color: vec4<f32> = vec4<f32>(1.0, 1.0, 1.0, 1.0);
    // Distance range as a fraction of the texture size.
    var unitRange: vec2<f32> = vec2<f32>(px_range) / vec2<f32>(textureDimensions(t_mtsdf));
    // Number of screen pixels the whole texture spans.
    var screenTexSize: vec2<f32> = vec2<f32>(1.0) / fwidth(in.tex_coords);
    // Distance range in screen pixels, never below one pixel.
    var screenPxRange: f32 = max(0.5 * dot(unitRange, screenTexSize), 1.0);
    var samp: vec4<f32> = textureSample(t_mtsdf, s_mtsdf, in.tex_coords);
    var sd: f32 = median3(samp.r, samp.g, samp.b);
    var screenPxDistance: f32 = screenPxRange * (sd - 0.5);
    var opacity: f32 = clamp(screenPxDistance + 0.5, 0.0, 1.0);
    return mix(bg_color, fg_color, opacity);
}
This produces this output with the familiar artifacts:
When I swap out the screenPxDistance calculation for
var screenPxDistance: f32 = screenPxRange * clamp(sd - 0.5, samp.a - 0.508, samp.a - 0.492);
the result is much better for this shape. Better general factors than these could probably be chosen, and they may depend on the shape. Maybe this clamp could also be applied before shading (I think so?), so that it works with normal MSDF images. It is possible to select clamp parameters that are conservative enough to correct those artifacts without causing new visible artifacts, at least at magnifications below the point where the effect of linear texture interpolation becomes visible.
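To see why the clamp suppresses islands, here is the same arithmetic mirrored on the CPU in Python. This is only a sketch: the `tolerance` parameter is a hypothetical name for the 0.008 margin implied by the 0.508 and 0.492 bounds above, and the texel values are made up to represent a spot where the RGB median disagrees with the true distance in the alpha channel.

```python
def median3(r, g, b):
    return max(min(r, g), min(max(r, g), b))

def clamp(x, lo, hi):
    return min(max(x, lo), hi)

def opacity(sample, screen_px_range, tolerance=None):
    """Opacity for one interpolated MTSDF sample (r, g, b, a).
    With tolerance set, the RGB median is kept within +/- tolerance
    of the true signed distance stored in the alpha channel."""
    r, g, b, a = sample
    sd = median3(r, g, b) - 0.5
    if tolerance is not None:
        sd = clamp(sd, a - 0.5 - tolerance, a - 0.5 + tolerance)
    return clamp(screen_px_range * sd + 0.5, 0.0, 1.0)

# Made-up texel: the median says "inside" (0.65), the true SDF says "outside" (0.4).
bad_texel = (0.7, 0.65, 0.3, 0.4)

print(opacity(bad_texel, 4.0))         # 1.0: a fully opaque island
print(opacity(bad_texel, 4.0, 0.008))  # about 0.13: mostly background
```

Where the median and the true distance already agree to within the tolerance, the clamp leaves the value unchanged, so it should only affect texels where the channels conflict.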