Efficient Params and String Formatting

Question

jaredpar opened this issue 6 years ago · 11 comments

This issue is related to the following:

LDM history:

Answer 1 · 2019-03-05T19:47:51.000Z

Overload resolution rules will be changed to prefer ValueFormattableString over string when the argument is an interpolated string.

🎆 🎉 🎆 🎉 🎆 🎉 🎆 🎉 🎆

Answer 2 · 2019-03-06T01:48:04.000Z

Can the Variant be optimized with runtime support? So far, the Variant is a 32-byte struct because the .NET runtime can't overlap structs and objects. Is there any chance to avoid the extra 'object' field?

Answer 3 · 2019-03-06T13:42:55.000Z

Great proposal!

Is there a link to a proposal, spec, or even code file for Variant? I'd be interested in reading further about it. Being able to store heterogeneous data without boxing would be fantastic in the context of data frames.

Answer 4 · 2019-03-06T16:21:18.000Z

@MgSam current variant code is here: dotnet/corefxlab#2595

Basically it is a wrapper around a union struct with some unsafe manipulation going on to get at the values without boxing as long as those values are from a small set of well known types.
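As a rough illustration of the kind of layout being discussed (not the actual corefxlab code), a union struct in C# can overlap its primitive fields with explicit layout, but the object reference needs its own slot, because the runtime refuses to load a type in which a reference field overlaps a non-reference field. The names UnionSketch and UnionDemo below are hypothetical:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sketch of a union-style struct; not the real Variant.
[StructLayout(LayoutKind.Explicit)]
struct UnionSketch
{
    // The primitive fields share the same 8 bytes.
    [FieldOffset(0)] public long Int64;
    [FieldOffset(0)] public double Double;

    // The reference cannot share storage with the fields above,
    // so it is placed after them in a slot of its own.
    [FieldOffset(8)] public object Reference;
}

static class UnionDemo
{
    // Writing through one overlapped field and reading through the
    // other reinterprets the same bits, with no boxing involved.
    public static long BitsOf(double value)
    {
        var u = new UnionSketch { Double = value };
        return u.Int64;
    }
}
```

Moving Reference to offset 0 in this sketch would be expected to fail at type-load time, which is the restriction behind the extra 'object' field.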

I'm not sure I like the conversion operators on it, but otherwise it is reasonable. As @morganbr points out, some OK-looking code gives perhaps unexpected results, such as (long)(Variant)(-1) (as opposed to (long)(Variant)(-1l)).

Answer 5 · 2019-03-07T03:46:45.000Z

Variant2 and Variant3

Why not use ValueTuple<Variant, Variant, ...> and have an extension method to create a Span<Variant> from it?

Answer 6 · 2019-04-28T22:55:30.000Z

When [the IEnumerable<T> variant is] invoked in T argument form the backing storage will be allocated as a T[] just as params T[] is done today.

If it was guaranteed by the language that params IEnumerable<T> uses an array, wouldn't that prevent some optimizations?

As a somewhat contrived example, consider:

string Format(string format, params IEnumerable<string> args);

…

foreach (var format in formats)
{
    Format(format, "foo", "bar", "baz");
}

A smart compiler could allocate a single custom implementation of IEnumerable<string> and use it for every iteration of the loop, if that was allowed. But if it used an array, it couldn't reuse it, because the called method could cast to array and then mutate it.

Though I'm not sure such an optimization would ever be implemented in the compiler; it's certainly much less useful than the other optimizations that are proposed here.
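To make the idea concrete, here is a minimal sketch of what such a hoisted allocation might look like if written by hand. ThreeStrings stands in for a hypothetical compiler-generated type, and Format takes a plain IEnumerable<string> because params IEnumerable<string> is only proposed here:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical compiler-generated type: an immutable sequence of three strings.
sealed class ThreeStrings : IEnumerable<string>
{
    private readonly string _a, _b, _c;

    public ThreeStrings(string a, string b, string c) { _a = a; _b = b; _c = c; }

    public IEnumerator<string> GetEnumerator()
    {
        yield return _a;
        yield return _b;
        yield return _c;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

static class ParamsReuseSketch
{
    // Stand-in for a method declared with the proposed params IEnumerable<string>.
    public static string Format(string format, IEnumerable<string> args)
        => string.Format(format, new List<object>(args).ToArray());

    public static List<string> FormatAll(IEnumerable<string> formats)
    {
        // Allocated once, outside the loop. The callee cannot mutate it,
        // so every iteration can safely receive the same instance.
        var shared = new ThreeStrings("foo", "bar", "baz");

        var results = new List<string>();
        foreach (var format in formats)
            results.Add(Format(format, shared));
        return results;
    }
}
```

With a string[] in place of ThreeStrings, the same hoisting would be unsafe, since Format could cast its argument back to an array and overwrite an element.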

readonly struct ValueFormattableString

Should ValueFormattableString be a ref struct, so that it could store the params ReadOnlySpan<Variant> collection without allocations?

ValueFormattableString.Create("hello {0}", new Variant(DateTime.UtcNow))

Since this API is not meant for human consumption, could you consider approaches that avoid the cost of parsing the format string?

For example, the code: ValueFormattableString vfs = $"Weight: {weight,7:f1} kg"; could be compiled into something like (using stackalloc as a shorthand for RuntimeIntrinsic.StackAlloc):

ReadOnlySpan<string> texts = stackalloc[] { "Weight: ", " kg" };
ReadOnlySpan<Variant> args = stackalloc[] { (Variant)weight };
ReadOnlySpan<int?> alignments = stackalloc[] { (int?)7 };
ReadOnlySpan<string> formats = stackalloc[] { "f1" };

var vfs = ValueFormattableString.Create(texts, args, alignments, formats);
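For illustration, a consumer of such pre-split parts could build the final string without ever parsing a composite format string. This is only a sketch under several assumptions: PreParsedFormatSketch and its Format method are hypothetical, object stands in for Variant, and arrays stand in for the stack-allocated spans:

```csharp
using System;
using System.Globalization;
using System.Text;

static class PreParsedFormatSketch
{
    // texts holds one more element than args; alignments and formats
    // hold one element per argument.
    public static string Format(
        ReadOnlySpan<string> texts,
        ReadOnlySpan<object> args,
        ReadOnlySpan<int?> alignments,
        ReadOnlySpan<string> formats,
        IFormatProvider provider)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < args.Length; i++)
        {
            sb.Append(texts[i]);

            // Apply the per-argument format specifier, if the value supports one.
            string piece = args[i] is IFormattable f
                ? f.ToString(formats[i], provider)
                : args[i]?.ToString() ?? "";

            // Positive alignment pads on the left, negative on the right,
            // mirroring composite formatting.
            int align = alignments[i] ?? 0;
            piece = align >= 0 ? piece.PadLeft(align) : piece.PadRight(-align);

            sb.Append(piece);
        }
        sb.Append(texts[args.Length]);
        return sb.ToString();
    }

    // Hand-written equivalent of $"Weight: {weight,7:f1} kg".
    public static string Example(double weight)
    {
        return Format(
            new[] { "Weight: ", " kg" },
            new object[] { weight },
            new int?[] { 7 },
            new[] { "f1" },
            CultureInfo.InvariantCulture);
    }
}
```

The object[] here still boxes the double; avoiding that is exactly what a Variant-based version would be for.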

Answer 7 · 2019-04-29T00:55:03.000Z

A smart compiler could allocate a single custom implementation of IEnumerable<string> and use it for every iteration of the loop, if that was allowed.

Assuming the callee doesn't store the IEnumerable for use after the method has returned, which is not an assumption you can easily make at a compiler level.

Answer 8 · 2019-04-29T01:19:10.000Z

@yaakov-h

A smart compiler could allocate a single custom implementation of IEnumerable<string> and use it for every iteration of the loop, if that was allowed.

Assuming the callee doesn't store the IEnumerable for use after the method has returned, which is not an assumption you can easily make at a compiler level.

Why would storing it be an issue? Iterating it will always return the same values, so multiple pieces of code (even multiple threads) iterating that IEnumerable should be fine.

Answer 9 · 2019-04-29T04:49:30.000Z

Oh right, because it's all constant values.

Answer 10 · 2019-05-09T04:52:45.000Z

Is this related to #535?

Answer 11 · 2022-09-26T17:51:45.000Z

Closing as this proposal has been broken off into three different items that are being pursued.