dotnet/csharplang

Efficient Params and String Formatting

jaredpar opened this issue · 11 comments

Overload resolution rules will be changed to prefer ValueFormattableString over string when the argument is an interpolated string.

ufcpp commented

Can the Variant be optimized with runtime support? So far, Variant is a 32-byte struct because the .NET runtime can't overlap struct fields and object references. Is there any chance of avoiding the extra 'object' field?

MgSam commented

Great proposal!

Is there a link to a proposal, spec, or even code file for Variant? I'd be interested in reading further about it. Being able to store heterogeneous data without boxing would be fantastic in the context of data frames.

@MgSam current variant code is here: dotnet/corefxlab#2595

Basically it is a wrapper around a union struct with some unsafe manipulation going on to get at the values without boxing as long as those values are from a small set of well known types.
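To make the shape described above concrete, here is a minimal sketch of a union-backed variant. It is not the corefxlab code: `MiniVariant`, its `Union` and `Kind` members, and the `TryGet*` methods are hypothetical names chosen for illustration, and it uses explicit field layout rather than the unsafe manipulation the linked code is said to use. Value-type payloads share one 8-byte slot, while anything else falls back to a separate `object` field, which is the extra field asked about earlier in the thread.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical miniature variant for illustration only; not the corefxlab type.
public readonly struct MiniVariant
{
    // All value-type payloads share the same bytes starting at offset 0.
    [StructLayout(LayoutKind.Explicit)]
    private struct Union
    {
        [FieldOffset(0)] public int Int32;
        [FieldOffset(0)] public long Int64;
        [FieldOffset(0)] public double Double;
    }

    private enum Kind : byte { Int32, Int64, Double, Object }

    private readonly Union _value;     // unboxed storage for the well-known types
    private readonly object? _object;  // separate slot for every other type
    private readonly Kind _kind;       // records which payload is valid

    public MiniVariant(int value)
    {
        _value = new Union { Int32 = value };
        _object = null;
        _kind = Kind.Int32;
    }

    public MiniVariant(long value)
    {
        _value = new Union { Int64 = value };
        _object = null;
        _kind = Kind.Int64;
    }

    public MiniVariant(double value)
    {
        _value = new Union { Double = value };
        _object = null;
        _kind = Kind.Double;
    }

    public MiniVariant(object value)
    {
        _value = default;
        _object = value;
        _kind = Kind.Object;
    }

    // Each accessor succeeds only when the stored kind matches exactly.
    public bool TryGetInt32(out int value)
    {
        value = _value.Int32;
        return _kind == Kind.Int32;
    }

    public bool TryGetInt64(out long value)
    {
        value = _value.Int64;
        return _kind == Kind.Int64;
    }

    public bool TryGetObject(out object? value)
    {
        value = _object;
        return _kind == Kind.Object;
    }
}
```

In this sketch, `new MiniVariant(42L).TryGetInt64(out var n)` succeeds without boxing, while a value stored as `int` is not readable as `long` because the kind check is strict. Whether the real type widens in that case is a design choice this sketch does not settle.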

I'm not sure I like the conversion operators on it, but otherwise it is reasonable. As @morganbr points out, some OK-looking code gives perhaps unexpected results, like (long)(Variant)(-1) (as opposed to (long)(Variant)(-1l)).

qrli commented

Variant2 and Variant3

Why not use ValueTuple<Variant, Variant, ...> and have an extension method to create a Span<Variant> from it?

svick commented

When [the IEnumerable<T> variant is] invoked in T argument form the backing storage will be allocated as a T[] just as params T[] is done today.

If it was guaranteed by the language that params IEnumerable<T> uses an array, wouldn't that prevent some optimizations?

As a somewhat contrived example, consider:

string Format(string format, params IEnumerable<string> args);

…

foreach (var format in formats)
{
    Format(format, "foo", "bar", "baz");
}

A smart compiler could allocate a single custom implementation of IEnumerable<string> and use it for every iteration of the loop, if that was allowed. But if it used an array, it couldn't reuse it, because the called method could cast to array and then mutate it.

Though I'm not sure such optimization would ever be implemented in the compiler; it's certainly much less useful than the other optimizations that are proposed here.
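The array-reuse hazard described above can be reproduced in today's C# by simulating the hoisting by hand. This is only a sketch: `ParamsReuseSketch`, its `Format` method, and `CallWithSharedArray` are hypothetical, and `params IEnumerable<string>` is replaced by an ordinary `IEnumerable<string>` parameter because the proposed feature is not assumed to exist.

```csharp
using System;
using System.Collections.Generic;

public static class ParamsReuseSketch
{
    // Hypothetical callee: declared over IEnumerable<string>, but it
    // downcasts to string[] and overwrites the caller's backing storage.
    public static string Format(string format, IEnumerable<string> args)
    {
        string result = format + ":" + string.Join(",", args);
        if (args is string[] array && array.Length > 0)
        {
            array[0] = "mutated";
        }
        return result;
    }

    // Simulates a compiler that allocates one array outside the loop
    // and passes the same instance on every iteration.
    public static List<string> CallWithSharedArray(IEnumerable<string> formats)
    {
        var shared = new[] { "foo", "bar", "baz" };
        var results = new List<string>();
        foreach (var format in formats)
        {
            results.Add(Format(format, shared));
        }
        return results;
    }
}
```

Calling `CallWithSharedArray(new[] { "a", "b" })` yields `a:foo,bar,baz` for the first call but `b:mutated,bar,baz` for the second, which is why a shared array would be observable. A read-only custom `IEnumerable<string>` would not expose this hole.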


readonly struct ValueFormattableString

Should ValueFormattableString be a ref struct, so that it could store the params ReadOnlySpan<Variant> collection without allocations?


ValueFormattableString.Create("hello {0}", new Variant(DateTime.UtcNow))

Since this API is not meant for human consumption, could you consider approaches that avoid the cost of parsing the format string?

For example, the code: ValueFormattableString vfs = $"Weight: {weight,7:f1} kg"; could be compiled into something like (using stackalloc as a shorthand for RuntimeIntrinsic.StackAlloc):

ReadOnlySpan<string> texts = stackalloc[] { "Weight: ", " kg" };
ReadOnlySpan<Variant> args = stackalloc[] { (Variant)weight };
ReadOnlySpan<int?> alignments = stackalloc[] { (int?)7 };
ReadOnlySpan<string> formats = stackalloc[] { "f1" };

var vfs = ValueFormattableString.Create(texts, args, alignments, formats);
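As a rough illustration of what consuming such pre-split pieces might look like, the sketch below assembles the final string without ever parsing a format string. Everything here is an assumption made for the example: `PreParsedFormatSketch.Format` is a hypothetical helper, arrays stand in for `ReadOnlySpan<T>`, and `object` stands in for `Variant` (so this version boxes, unlike the proposal).

```csharp
using System;
using System.Globalization;
using System.Text;

public static class PreParsedFormatSketch
{
    // Hypothetical helper: texts has one more entry than args, and
    // alignments/formats have one entry per argument (null when absent).
    public static string Format(string[] texts, object[] args, int?[] alignments, string?[] formats)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < args.Length; i++)
        {
            sb.Append(texts[i]);

            // Apply the per-argument format specifier when the value supports it.
            string piece = args[i] is IFormattable formattable
                ? formattable.ToString(formats[i], CultureInfo.InvariantCulture)
                : (args[i].ToString() ?? "");

            // Positive alignment right-aligns, negative left-aligns,
            // matching composite formatting conventions.
            int width = alignments[i] ?? 0;
            piece = width >= 0 ? piece.PadLeft(width) : piece.PadRight(-width);

            sb.Append(piece);
        }
        sb.Append(texts[args.Length]);
        return sb.ToString();
    }
}
```

With `weight = 72.5`, calling `Format(new[] { "Weight: ", " kg" }, new object[] { 72.5 }, new int?[] { 7 }, new[] { "f1" })` produces `Weight:    72.5 kg`, the same text the interpolated string would give under the invariant culture.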

A smart compiler could allocate a single custom implementation of IEnumerable<string> and use it for every iteration of the loop, if that was allowed.

Assuming the callee doesn't store the IEnumerable for use after the method has returned, which is not an assumption you can easily make at a compiler level.

svick commented

@yaakov-h

A smart compiler could allocate a single custom implementation of IEnumerable<string> and use it for every iteration of the loop, if that was allowed.

Assuming the callee doesn't store the IEnumerable for use after the method has returned, which is not an assumption you can easily make at a compiler level.

Why would storing it be an issue? Iterating it will always return the same values, so multiple pieces of code (even multiple threads) iterating that IEnumerable should be fine.

Oh right, because it's all constant values.

Is this related to #535?

Closing as this proposal has been broken off into three different items that are being pursued.