jyrkioraskari/IFCtoLBD

Strange handling of UTF-8 characters

Closed · 1 comment

Several characters get wrongly encoded somewhere in the Revit-to-IFC-to-LBD conversion.

Apologies if it's not related to IFC-to-LBD.

Documented as part of Isaac Fatokun, Arun Raveendran Nair, Thamer Mecharnia, Maxime Lefrançois, Victor Charpenay, Fabien Badeig and Antoine Zimmermann, (2023) "Modular Knowledge integration for Smart Building Digital Twins", LDAC 2023

To Reproduce

See https://github.com/maximelefrancois86/databat-kgc-revit for steps to reproduce the issue

Expected behavior

Characters should survive the conversion unchanged. Instead, for example, "\n" is encoded as "\X\0D\X\0A" and "é" is encoded as "\X\00E9\X\0A".

Versions

  • OS: Windows 11
  • Revit version 2021
  • IFC exporter version unknown
  • IFC to LBD version 2.39.0

The encoding is already present in the IFC STEP files:
#195540= IFCOPENINGELEMENT('0bCEnpm9XD2wi1dHaVsOsO',#48,'Basic Wall:G\X2\00E9\X0\n\X2\00E9\X0\rique - B\X2\00E9\X0\ton 20 cm:172807',$,$,#195539,#195533,'860366',.OPENING.);

Decoding of these escaped strings has been added to the converter. It will be available in the next release.
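For readers who hit the same escapes before upgrading, here is a minimal Python sketch of what decoding them involves. It is not the converter's actual code (IFCtoLBD is written in Java), and `decode_step_string` is a hypothetical helper name. It assumes the ISO 10303-21 string directives `\X\hh` (one 8-bit character), `\X2\...\X0\` (groups of four hex digits) and `\X4\...\X0\` (groups of eight hex digits). It does not handle the `\S\` and `\P\` directives, doubled apostrophes, or doubled backslashes.

```python
import re

# One pattern covering the three \X directives (assumed ISO 10303-21 forms):
#   \X2\hhhh...\X0\      groups of 4 hex digits
#   \X4\hhhhhhhh...\X0\  groups of 8 hex digits
#   \X\hh                a single character in the range U+0000..U+00FF
_ESCAPE = re.compile(
    r"\\X2\\((?:[0-9A-F]{4})+)\\X0\\"
    r"|\\X4\\((?:[0-9A-F]{8})+)\\X0\\"
    r"|\\X\\([0-9A-F]{2})"
)


def decode_step_string(text: str) -> str:
    """Replace \\X, \\X2 and \\X4 escapes in a STEP string with real characters."""

    def _replace(match: re.Match) -> str:
        x2, x4, x8 = match.groups()
        if x2 is not None:
            # Decoding as UTF-16 also accepts surrogate pairs, should an exporter emit them.
            return bytes.fromhex(x2).decode("utf-16-be")
        if x4 is not None:
            return bytes.fromhex(x4).decode("utf-32-be")
        return chr(int(x8, 16))

    return _ESCAPE.sub(_replace, text)


# The name taken from the IFCOPENINGELEMENT line quoted above.
raw = r"Basic Wall:G\X2\00E9\X0\n\X2\00E9\X0\rique - B\X2\00E9\X0\ton 20 cm:172807"
print(decode_step_string(raw))  # Basic Wall:Générique - Béton 20 cm:172807
```

Note that the `n` after the first `\X0\` is a literal letter (the "n" in "Générique"), not a newline escape. Matching the closing `\X0\` as part of the pattern is what keeps it from being misread.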