Efficient Params and String Formatting
jaredpar opened this issue · 11 comments
- Proposal added
- Discussed in LDM
- Decision in LDM
- Finalized (done, rejected, inactive)
- Spec'ed
This issue is related to the following:
LDM history:
Overload resolution rules will be changed to prefer ValueFormattableString over string when the argument is an interpolated string.
Can the Variant be optimized with runtime support? So far, the Variant is a 32-byte struct because the .NET runtime can't overlap struct fields and object references. Is there any chance to avoid the extra 'object' field?
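For readers following the layout question, here is a minimal sketch of why the extra field exists. ValueUnion and SketchVariant are hypothetical stand-ins, not the corefxlab Variant. The point is only that value-type fields can share an offset under explicit layout, while an object reference has to live in its own field, because the runtime refuses to load a type in which an object reference overlaps a non-object field.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical stand-in for the unboxed part of a variant:
// every field starts at offset 0, so they share the same 16 bytes.
[StructLayout(LayoutKind.Explicit)]
public struct ValueUnion
{
    [FieldOffset(0)] public long Int64;
    [FieldOffset(0)] public double Double;
    [FieldOffset(0)] public Guid Guid;   // 16 bytes, the largest member in this sketch
}

// Hypothetical variant: the reference cannot be placed at an offset
// inside ValueUnion, so it costs an additional pointer-sized field.
public struct SketchVariant
{
    public ValueUnion Value;    // unboxed payload for well-known value types
    public object Reference;    // strings and other reference types
}

public static class LayoutDemo
{
    public static int UnionSize() => Unsafe.SizeOf<ValueUnion>();
    public static int VariantSize() => Unsafe.SizeOf<SketchVariant>();
}
```

The real Variant carries more state than this sketch, so the sizes will not match it exactly; the sketch only shows where the separate reference field comes from.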
Great proposal!
Is there a link to a proposal, spec, or even code file for Variant? I'd be interested in reading further about it. Being able to store heterogeneous data without boxing would be fantastic in the context of data frames.
@MgSam current variant code is here: dotnet/corefxlab#2595
Basically it is a wrapper around a union struct with some unsafe manipulation going on to get at the values without boxing, as long as those values are from a small set of well-known types.
I'm not sure I like the conversion operators on it but otherwise it is reasonable. As @morganbr points out, some OK-looking code gives perhaps unexpected results, like (long)(Variant)(-1) (as opposed to (long)(Variant)(-1l)).
Variant2 and Variant3
Why not use ValueTuple<Variant, Variant, ...> and have an extension method to create a Span<Variant> from it?
When [the IEnumerable<T> variant is] invoked in T argument form the backing storage will be allocated as a T[] just as params T[] is done today.
If it was guaranteed by the language that params IEnumerable<T> uses an array, wouldn't that prevent some optimizations?
As a somewhat contrived example, consider:
string Format(string format, params IEnumerable<string> args);
…
foreach (var format in formats)
{
    Format(format, "foo", "bar", "baz");
}
A smart compiler could allocate a single custom implementation of IEnumerable<string> and use it for every iteration of the loop, if that was allowed. But if it used an array, it couldn't reuse it, because the called method could cast to array and then mutate it.
Though I'm not sure such an optimization would ever be implemented in the compiler; it's certainly much less useful than the other optimizations that are proposed here.
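To make the difference concrete, here is a rough sketch. FixedArgs is a hypothetical stand-in for what a compiler might generate; nothing in the proposal says it would look like this. ParamsDemo.Format is a deliberately badly behaved callee that casts its arguments back to an array and writes to it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical compiler-generated sequence. It holds no mutable state,
// so a single instance can be handed to every call.
public sealed class FixedArgs : IEnumerable<string>
{
    public static readonly FixedArgs Instance = new FixedArgs();

    public IEnumerator<string> GetEnumerator()
    {
        yield return "foo";
        yield return "bar";
        yield return "baz";
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class ParamsDemo
{
    // A callee that misbehaves: when the arguments are really an array,
    // it overwrites the first element after formatting.
    public static string Format(string format, IEnumerable<string> args)
    {
        string[] copy = new List<string>(args).ToArray();
        string result = string.Format(format, copy);
        if (args is string[] array)
        {
            array[0] = "mutated";   // visible to the next call if the array is reused
        }
        return result;
    }
}
```

Passing the same string[] twice yields "foo bar baz" and then "mutated bar baz", while passing FixedArgs.Instance twice yields "foo bar baz" both times, since the callee has nothing it can write to.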
readonly struct ValueFormattableString
Should ValueFormattableString be a ref struct, so that it could store the params ReadOnlySpan<Variant> collection without allocations?
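As a sketch of what that would mean: a span can only be stored as a field of a ref struct, so a type that keeps the arguments as a span has to be one. SpanFormattable below is a hypothetical shape, not the proposal's type, and it uses object instead of Variant to stay self-contained.

```csharp
using System;

// Hypothetical shape. Because it is a ref struct it may hold a span,
// but it cannot be boxed or stored in a field of a class.
public readonly ref struct SpanFormattable
{
    public SpanFormattable(string format, ReadOnlySpan<object> arguments)
    {
        Format = format;
        Arguments = arguments;
    }

    public string Format { get; }
    public ReadOnlySpan<object> Arguments { get; }

    // For illustration only: this copies the arguments into an array,
    // which a real implementation would want to avoid.
    public override string ToString() => string.Format(Format, Arguments.ToArray());
}
```

The trade-off is that such a value can only live on the stack, which may or may not be acceptable for the APIs that would consume it.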
ValueFormattableString.Create("hello {0}", new Variant(DateTime.UtcNow))
Since this API is not meant for human consumption, could you consider approaches that avoid the cost of parsing the format string?
For example, the code: ValueFormattableString vfs = $"Weight: {weight,7:f1} kg"; could be compiled into something like (using stackalloc as a shorthand for RuntimeIntrinsic.StackAlloc):
ReadOnlySpan<string> texts = stackalloc[] { "Weight: ", " kg" };
ReadOnlySpan<Variant> args = stackalloc[] { (Variant)weight };
ReadOnlySpan<int?> alignments = stackalloc[] { (int?)7 };
ReadOnlySpan<string> formats = stackalloc[] { "f1" };
var vfs = ValueFormattableString.Create(texts, args, alignments, formats);
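A consumer of that pre-split form could then produce the final string without ever scanning a format string. The following is only a sketch under assumptions: PreParsedFormat is a hypothetical helper, it takes object instead of Variant (so it boxes), and it assumes texts always has exactly one more element than args.

```csharp
using System;
using System.Text;

public static class PreParsedFormat
{
    // Hypothetical helper: texts[i] is the literal text before args[i],
    // and the last element of texts is the trailing literal.
    public static string Format(
        ReadOnlySpan<string> texts,
        ReadOnlySpan<object> args,
        ReadOnlySpan<int?> alignments,
        ReadOnlySpan<string> formats,
        IFormatProvider provider)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < args.Length; i++)
        {
            builder.Append(texts[i]);

            // Apply the per-hole format specifier when the value supports it.
            string piece = args[i] is IFormattable formattable
                ? formattable.ToString(formats[i], provider)
                : args[i]?.ToString() ?? string.Empty;

            // Same convention as composite formatting:
            // positive width right-aligns, negative width left-aligns.
            int width = alignments[i] ?? 0;
            piece = width >= 0 ? piece.PadLeft(width) : piece.PadRight(-width);

            builder.Append(piece);
        }
        builder.Append(texts[texts.Length - 1]);
        return builder.ToString();
    }
}
```

With texts { "Weight: ", " kg" }, a single argument 72.5, alignment 7, format "f1" and the invariant culture, this returns "Weight:    72.5 kg", the same text the interpolated string produces under that culture.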
A smart compiler could allocate a single custom implementation of IEnumerable and use it for every iteration of the loop, if that was allowed.
Assuming the callee doesn't store the IEnumerable for use after the method has returned, which is not an assumption you can easily make at a compiler level.
A smart compiler could allocate a single custom implementation of IEnumerable and use it for every iteration of the loop, if that was allowed.
Assuming the callee doesn't store the IEnumerable for use after the method has returned, which is not an assumption you can easily make at a compiler level.
Why would storing it be an issue? Iterating it will always return the same values, so multiple pieces of code (even multiple threads) iterating that IEnumerable should be fine.
Oh right, because it's all constant values.