syoyo/tinygltf

model.nodes[idx].name returns the wrong string when the name is a Unicode string?

pigLoveRabbit520 opened this issue · 6 comments

Describe the issue

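// Assumes tiny_gltf.h, <filesystem>, <iostream> and <string> are included,
// and that filePath names an ASCII .gltf file.
tinygltf::Model model;
tinygltf::TinyGLTF loader;
std::string err;
std::string warn;

auto fileExt = std::filesystem::path(filePath).extension();
auto realExt = fileExt.generic_string();
loader.LoadASCIIFromFile(&model, &err, &warn, filePath);
for (auto idx : model.scenes[0].nodes)
{
    // name holds the bytes returned by the JSON backend, printed unchanged
    std::cout << model.nodes[idx].name << std::endl;
}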

"\u7ec4" should be displayed as 组, but something else was printed.

To Reproduce

  • Windows 10
  • Visual Studio 2022

gltf

{
    "asset" : {
        "generator" : "Khronos glTF Blender I/O v1.8.19",
        "version" : "2.0"
    },
    "scene" : 0,
    "scenes" : [
        {
            "name" : "Scene",
            "nodes" : [
                3
            ]
        }
    ],
    "nodes" : [
        {
            "mesh" : 0,
            "name" : "1",
            "rotation" : [
                0,
                -1,
                0,
                1.9470718370939721e-07
            ],
            "translation" : [
                -134.16050720214844,
                25.878971099853516,
                -2.323408352822298e-06
            ]
        },
        {
            "mesh" : 1,
            "name" : "2",
            "rotation" : [
                0,
                -1,
                0,
                1.9470718370939721e-07
            ],
            "translation" : [
                -134.16050720214844,
                -13.597759246826172,
                -2.603506572995684e-06
            ]
        },
        {
            "mesh" : 2,
            "name" : "3",
            "rotation" : [
                0,
                -1,
                0,
                1.9470718370939721e-07
            ],
            "translation" : [
                -134.16050720214844,
                3.210163116455078,
                -2.657505319803022e-06
            ]
        },
        {
            "children" : [
                0,
                1,
                2
            ],
            "name" : "\u7ec4",
            "scale" : [
                0.0010000000474974513,
                0.0010000000474974513,
                0.0010000000474974513
            ],
            "translation" : [
                0.1340475082397461,
                -0.00315132737159729,
                0
            ]
        }
    ],
  ...
}

syoyo commented

It prints fine in loader_example on Linux with the nlohmann JSON backend: name : 组

Probably one of the following applies:

  • The code is compiled with MBCS. You need to switch to UNICODE.
  • The console uses a non-Unicode code page. Run chcp to check the code page you are using.
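
If the console code page turns out to be the cause, one possible workaround is to switch the console output to UTF-8 before printing. The following is only a sketch: it assumes the Windows API call SetConsoleOutputCP and the constant CP_UTF8 from <windows.h>, and the helper names EnableUtf8Console and PrintUtf8Name are hypothetical, not part of tinygltf. On other platforms the helper does nothing.

```cpp
#include <iostream>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of tinygltf): ask the Windows console to
// interpret output bytes as UTF-8. Returns true on success.
bool EnableUtf8Console()
{
#ifdef _WIN32
    // CP_UTF8 is code page 65001, the same page "chcp 65001" selects.
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    // Assumption: terminals on other platforms already expect UTF-8.
    return true;
#endif
}

// Hypothetical helper: print a UTF-8 encoded name such as
// model.nodes[idx].name without modifying its bytes.
void PrintUtf8Name(const std::string &name)
{
    std::cout << name << std::endl;
}
```

Calling EnableUtf8Console() once at startup and then PrintUtf8Name(model.nodes[idx].name) may be enough, provided the console font also contains the glyph.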

pigLoveRabbit520 commented

I already use UNICODE in the VC settings. I found that the name contains three bytes and that their values are negative.

I tested the same code on Ubuntu and got the correct result.

syoyo commented

A CJK character in UTF-8 is usually represented by three bytes. Each of those bytes is 0x80 or above, so it appears negative when viewed as a signed char.
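
To illustrate why three negative values are expected rather than a sign of corruption, here is a small sketch that shows the same bytes in both a signed and an unsigned view. The helper names BytesAsSigned and BytesAsUnsigned are hypothetical. The byte sequence E7 BB 84 is the UTF-8 encoding of U+7EC4 (组).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: return each byte of s as a signed 8-bit value
// (-128..127), which is how a debugger shows it when char is signed.
std::vector<int> BytesAsSigned(const std::string &s)
{
    std::vector<int> out;
    for (unsigned char c : s)
    {
        // Map 0x80..0xFF to -128..-1 explicitly so the result does not
        // depend on whether plain char is signed on this platform.
        out.push_back(c >= 0x80 ? static_cast<int>(c) - 256
                                : static_cast<int>(c));
    }
    return out;
}

// Hypothetical helper: return each byte of s as an unsigned value (0..255).
std::vector<int> BytesAsUnsigned(const std::string &s)
{
    std::vector<int> out;
    for (unsigned char c : s)
    {
        out.push_back(c);
    }
    return out;
}
```

For the string "\xE7\xBB\x84", BytesAsUnsigned gives 231, 187, 132 and BytesAsSigned gives -25, -69, -124. Both describe the same valid UTF-8 sequence.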

tinygltf simply uses the std::string returned by the JSON backend, so you should first check what happens when you process the JSON file directly with json.hpp (or RapidJSON, if you use that backend).
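
As a reference point for that check, the sketch below computes the UTF-8 bytes a JSON backend would be expected to return for an escape such as "\u7ec4". It is a simplified stand-in, not the actual implementation in nlohmann/json or RapidJSON: the function names EncodeUtf8 and DecodeJsonUnicodeEscape are hypothetical, and only a single escape in the Basic Multilingual Plane is handled (no surrogate pairs).

```cpp
#include <cctype>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: encode a code point in U+0000..U+FFFF as UTF-8.
std::string EncodeUtf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80)
    {
        out += static_cast<char>(cp);                        // 1 byte
    }
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));          // 2 bytes
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xE0 | (cp >> 12));         // 3 bytes
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: decode one JSON escape of the form \uXXXX.
// Simplification: surrogate halves are rejected instead of being paired.
std::string DecodeJsonUnicodeEscape(const std::string &escape)
{
    if (escape.size() != 6 || escape[0] != '\\' || escape[1] != 'u')
    {
        throw std::invalid_argument("expected \\uXXXX");
    }
    for (std::size_t i = 2; i < 6; ++i)
    {
        if (!std::isxdigit(static_cast<unsigned char>(escape[i])))
        {
            throw std::invalid_argument("expected four hex digits");
        }
    }
    const auto cp =
        static_cast<std::uint32_t>(std::stoul(escape.substr(2), nullptr, 16));
    if (cp >= 0xD800 && cp <= 0xDFFF)
    {
        throw std::invalid_argument("surrogates are not handled here");
    }
    return EncodeUtf8(cp);
}
```

If the string obtained from the backend for the fourth node's name equals DecodeJsonUnicodeEscape("\\u7ec4"), that is the bytes E7 BB 84, the parsing step is behaving as expected and the remaining suspect is how the bytes are displayed.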

@syoyo It seems to be a bug in nlohmann/json.