raysan5/raylib

[rtext] Font loading and export, codepoints never filled with unique entries.

designerfuzzi opened this issue · 3 comments

While using ExportFontAsCode(font, "/Users/user/project/FontTestExport.h"); I noticed that my codepoints array is not filled with unique entries, but with every character encountered, duplicates included.
So I looked into raylib's LoadCodepoints() and indeed it stores every character, even one that appears several times, which means each of them gets loaded or exported as often as it appears in the array. I assume the function was not intended for plain UTF-8 encoded text but for hand-picked collections of characters. I then searched raylib for a function that returns only unique codepoints and could not find one.

Am I doing it wrong, or is this indeed a missing feature?

In the meantime I wrote a function that returns unique codepoints. The goal is to load only the characters a font actually needs by dropping duplicates while the codepoint array is built. That way the final buffer re-allocation actually makes sense: the existing LoadCodepoints() never removes duplicates, so for plain ASCII text it reallocates to the same size as before.

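// Returns true if 'search' is among the first 'count' entries of 'codepoints'
bool ContainsCodepoint(int *codepoints, int count, int search) {
    if (codepoints == NULL) return false;
    // Scan valid indices 0..count-1 (the original loop read codepoints[count] and skipped index 0)
    for (int i = 0; i < count; i++) {
        if (codepoints[i] == search) return true;
    }
    return false;
}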
int *LoadUniqueCodepoints(const char *text, int *count)
{
    int textLength = TextLength(text);

    int codepointSize = 0;
    int codepointCount = 0;

    // Allocate a big enough buffer to store as many codepoints as text bytes
    int *codepoints = (int *)RL_CALLOC(textLength, sizeof(int));

    for (int i = 0; i < textLength; i += codepointSize)
    {
        int uni = GetCodepointNext(text + i, &codepointSize);
        // Skip codepoints that were already stored
        if (ContainsCodepoint(codepoints, codepointCount, uni)) continue;
        codepoints[codepointCount] = uni;
        codepointCount++;
    }

    // Re-allocate buffer to the actual number of codepoints loaded
    int *temp = (int *)RL_REALLOC(codepoints, codepointCount*sizeof(int));
    if (temp != NULL) codepoints = temp;

    *count = codepointCount;

    return codepoints;
}

Please test.
For me, this function noticeably improves the loading process.
For example, the text "Hello World" already contains "l" three times. It should load the "l" only once.
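
To make the "Hello World" case concrete, here is a minimal sketch that does not depend on raylib. DecodeUtf8 and CollectCodepoints are hypothetical names used only for illustration; DecodeUtf8 is a simplified stand-in for GetCodepointNext() and assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical stand-in for raylib's GetCodepointNext(): decodes one UTF-8
// sequence and stores its byte length in *size.
// Assumption: the input is well-formed UTF-8 (no validation is done).
static int DecodeUtf8(const char *text, int *size)
{
    const unsigned char *p = (const unsigned char *)text;

    if (p[0] < 0x80) { *size = 1; return p[0]; }
    if ((p[0] & 0xE0) == 0xC0)
    {
        *size = 2;
        return ((p[0] & 0x1F) << 6) | (p[1] & 0x3F);
    }
    if ((p[0] & 0xF0) == 0xE0)
    {
        *size = 3;
        return ((p[0] & 0x0F) << 12) | ((p[1] & 0x3F) << 6) | (p[2] & 0x3F);
    }
    if ((p[0] & 0xF8) == 0xF0)
    {
        *size = 4;
        return ((p[0] & 0x07) << 18) | ((p[1] & 0x3F) << 12) |
               ((p[2] & 0x3F) << 6) | (p[3] & 0x3F);
    }

    *size = 1;      // Unexpected lead byte: consume it and substitute '?'
    return '?';
}

// Hypothetical helper: writes the codepoints of 'text' into 'out' and returns
// how many were written. With unique == true, a repeated codepoint is stored
// only once. 'out' must have room for at least strlen(text) ints.
static int CollectCodepoints(const char *text, int *out, bool unique)
{
    assert((text != NULL) && (out != NULL));

    int length = (int)strlen(text);
    int count = 0;
    int size = 0;

    for (int i = 0; i < length; i += size)
    {
        int codepoint = DecodeUtf8(text + i, &size);

        // Linear search over the entries stored so far (indices 0..count-1)
        bool seen = false;
        for (int j = 0; unique && (j < count); j++)
        {
            if (out[j] == codepoint) { seen = true; break; }
        }

        if (!seen) out[count++] = codepoint;
    }

    return count;
}
```

With "Hello World" this should give 11 entries when duplicates are kept and 8 when they are dropped (H, e, l, o, space, W, r, d). The linear search is quadratic in the worst case, which seems acceptable for short character sets; a larger text would probably want a sorted array or a bitset instead.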

@designerfuzzi Actually that's the expected behaviour. The function is intended to convert a UTF-8 string into a codepoint array.

You can check the text_codepoints_loading example to see how duplicates can be removed. That's up to the user for now, but maybe at some point I will add that function to raylib.
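
For the user-side approach, a minimal sketch of removing duplicates from an array that was already filled (for instance by LoadCodepoints()) could look like the following. This is not the code from the text_codepoints_loading example; DedupCodepointsInPlace is a hypothetical name, and the sketch assumes the original order of first occurrences should be kept.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: compacts 'codepoints' in place so that each value
// appears once, keeping the order of first occurrence.
// Returns the number of unique entries now at the front of the array.
static int DedupCodepointsInPlace(int *codepoints, int count)
{
    assert((codepoints != NULL) || (count == 0));

    int uniqueCount = 0;

    for (int i = 0; i < count; i++)
    {
        // Check the already compacted prefix, indices 0..uniqueCount-1
        int seen = 0;
        for (int j = 0; j < uniqueCount; j++)
        {
            if (codepoints[j] == codepoints[i]) { seen = 1; break; }
        }

        // uniqueCount <= i always holds, so this never overwrites unread data
        if (!seen) codepoints[uniqueCount++] = codepoints[i];
    }

    return uniqueCount;
}
```

The compacted array and the returned count could then be passed on to a font loading call such as LoadFontEx(), and the buffer would still be released with UnloadCodepoints() as before, since the allocation itself is unchanged.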

Hmm. Then maybe the documentation line should state this behaviour. Still, it does not make much sense for a font to load the "l" in "Hello World" three times by default. Thanks for the reply. I'll go with a local implementation for now.

@designerfuzzi I think the documentation is clear on its behaviour:

// Load all codepoints from a UTF-8 text string, codepoints count returned by parameter

It does not mention any filtering or duplicates removal.