CommunityToolkit/Lottie-Windows

BOM Gone?

r2d2Proton opened this issue · 1 comments

As best I can tell, Loader.cs, StorageFileLoader.cs, and LottieCompositionReader.cs in Lottie-Windows make an effort to handle files in different encodings, UTF and non-UTF:

public static LottieComposition? ReadLottieCompositionFromJsonStream(Stream stream, Options options, out IReadOnlyList<(string Code, string Description)> issues)
{
    ReadStreamToUTF8(stream, out var utf8Text);
    return ReadLottieCompositionFromJson(utf8Text, options, out issues);
}

static void ReadStreamToUTF8(Stream stream, out ReadOnlySpan<byte> utf8Text)
{
    // This buffer size is chosen to be about 50% larger than
    // the average file size in our corpus, so most of the time
    // we don't need to reallocate and copy.
    var buffer = new byte[150000];
    var bytesRead = stream.Read(buffer, 0, buffer.Length);
    var spaceLeftInBuffer = buffer.Length - bytesRead;

    while (spaceLeftInBuffer == 0)
    {
        // Might be more to read. Expand the buffer.
        var newBuffer = new byte[buffer.Length * 2];
        spaceLeftInBuffer = buffer.Length;
        var totalBytesRead = buffer.Length;
        Array.Copy(buffer, 0, newBuffer, 0, totalBytesRead);
        buffer = newBuffer;
        bytesRead = stream.Read(buffer, totalBytesRead, buffer.Length - totalBytesRead);
        spaceLeftInBuffer -= bytesRead;
    }

    utf8Text = new ReadOnlySpan<byte>(buffer);
    NormalizeTextToUTF8(ref utf8Text);
}

static void NormalizeTextToUTF8(ref ReadOnlySpan<byte> text)
{
    if (text.Length >= 1)
    {
        switch (text[0])
        {
            case 0xEF:
                // Possibly start of UTF8 BOM.
                if (text.Length >= 3 && text[1] == 0xBB && text[2] == 0xBF)
                {
                    // UTF8 BOM. Step over the UTF8 BOM.
                    text = text.Slice(3, text.Length - 3);
                }
                break;  
        }
    }
}

As best I can tell, when loading UTF-8 files with:

var filePicker = new FileOpenPicker{};
StorageFile? file = await filePicker.PickSingleFileAsync();

The BOM has already been consumed by a function before this code is called. The buffer begins directly at the "{" that opens the JSON.
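One way to confirm this would be to peek at the first three bytes of the stream just before it is handed to the reader. The following is only a minimal sketch: `BomProbe`, `StartsWithUtf8Bom`, and `StreamStartsWithUtf8Bom` are hypothetical names for illustration, not part of Lottie-Windows, and the stream is assumed to be seekable.

```csharp
using System;
using System.IO;

static class BomProbe
{
    // Hypothetical helper: true when the data begins with the UTF-8 BOM (EF BB BF).
    public static bool StartsWithUtf8Bom(ReadOnlySpan<byte> data) =>
        data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF;

    // Reads up to three bytes from a seekable stream, then restores the position
    // so the caller can still read the stream from where it was.
    public static bool StreamStartsWithUtf8Bom(Stream stream)
    {
        var start = stream.Position;
        var head = new byte[3];
        var read = 0;
        while (read < head.Length)
        {
            // Stream.Read may return fewer bytes than requested; 0 means end of stream.
            var n = stream.Read(head, read, head.Length - read);
            if (n == 0)
            {
                break;
            }
            read += n;
        }
        stream.Position = start;
        return StartsWithUtf8Bom(head.AsSpan(0, read));
    }
}
```

If this returns false for a file that is known to contain a BOM on disk, that would support the idea that something upstream has already stripped it.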

Simplified version:

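static void ReadStreamToUTF8(Stream stream, out ReadOnlySpan<byte> utf8Text)
{
    // Size the buffer from the stream length (requires a seekable stream).
    var buffer = new byte[stream.Length];
    var totalBytesRead = 0;
    int bytesRead;
    // Stream.Read may return fewer bytes than requested, so loop until done.
    while (totalBytesRead < buffer.Length &&
           (bytesRead = stream.Read(buffer, totalBytesRead, buffer.Length - totalBytesRead)) > 0)
    {
        totalBytesRead += bytesRead;
    }
    utf8Text = new ReadOnlySpan<byte>(buffer, 0, totalBytesRead);
    NormalizeTextToUTF8(ref utf8Text);
}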

Also, please note that the Lottie file I am testing with happens to be 1,812,872 bytes, though many others are less than 100 KB. Checking more files. . .

Other files in the same folder are also larger than the 150,000-byte buffer allocated above (793 KB, 329 KB, 259 KB, 257 KB, . . . , 223 KB).

Another is 2,136,832 bytes.
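To put rough numbers on what the doubling loop costs for files this size, here is a small sketch that simulates its growth rule. `BufferGrowth.Simulate` is a hypothetical helper, not library code, and it assumes every `Stream.Read` call fills the space it is given while data remains, which `Stream.Read` does not guarantee.

```csharp
using System;

static class BufferGrowth
{
    // Mirrors the growth rule in the quoted loop: the buffer doubles
    // whenever it has been filled completely.
    // Assumption: each read fills all remaining space while data is available.
    public static (int Doublings, long BytesCopied) Simulate(long fileSize, long initialCapacity = 150000)
    {
        var capacity = initialCapacity;
        var doublings = 0;
        long bytesCopied = 0;
        while (capacity <= fileSize)
        {
            bytesCopied += capacity; // Array.Copy of the full old buffer
            capacity *= 2;
            doublings++;
        }
        return (doublings, bytesCopied);
    }
}
```

Under that assumption, `BufferGrowth.Simulate(1812872)` gives 4 doublings and 2,250,000 bytes copied, and the 2,136,832-byte file gives the same result, whereas a file under 150,000 bytes needs no reallocation at all.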