apache/uima-uimaj

COMPRESSED_FILTERED_TSI produces different output on different JDKs

reckart opened this issue · 1 comment

Describe the bug
COMPRESSED_FILTERED_TSI produces different serialized output on different JDKs, or at least on different systems.

To Reproduce
Steps to reproduce the behavior:
Run CasSerializationDeserialization_COMPRESSED_FILTERED_TSI_Test.serializeAndCompareToReferenceTest, in particular tests 12 (casWithLists) and 13 (casWithArrays), on different JDKs:

  • ❌ macOS
    openjdk version "1.8.0_252"
    OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_252-b09)
    OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.252-b09, mixed mode)
  • ✅ macOS
    openjdk version "1.8.0_332"
    OpenJDK Runtime Environment (Zulu 8.62.0.19-CA-macosx) (build 1.8.0_332-b09)
    OpenJDK 64-Bit Server VM (Zulu 8.62.0.19-CA-macosx) (build 25.332-b09, mixed mode)
  • ❌ Linux
    Azul_JDK_8/zulu8.62.0.19-ca-jdk8.0.332-linux_x64

Expected behavior
The format should be consistent across deployments.

Please complete the following information:

  • Version: 3.4.0-SNAPSHOT

Interestingly, the deserialized CASes seem to be equivalent, though.

The difference may come from platform-dependent or non-deterministic behaviour in the compression, or in the order in which the feature structures are iterated.

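import static org.assertj.core.api.Assertions.assertThat;

import java.io.File;

import org.apache.uima.cas.CAS;
import org.apache.uima.util.CasCreationUtils;
import org.apache.uima.util.CasIOUtils;
import org.junit.jupiter.api.Test;

class Investigate {
  // toComparableString(CAS) is a test helper that is not shown here.
  @Test
  void test() throws Exception {
    CAS ref = CasCreationUtils.createCas();
    CAS act = CasCreationUtils.createCas();

    // File.toURL() is deprecated; go through toURI() instead.
    CasIOUtils.load(new File("actual-cas.bin").toURI().toURL(), act);
    CasIOUtils.load(new File("reference-cas.bin").toURI().toURL(), ref);

    assertThat(toComparableString(act)).isEqualTo(toComparableString(ref));
  }
}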
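
Since the deserialized CASes compare equal, it may help to find out where the serialized bytes first diverge (for example, in a header or in a compressed segment). The following is only a minimal sketch of such a check, not part of the UIMA code base: the class name FirstDifference and the method firstMismatch are hypothetical, and the two file names are passed as arguments. It avoids Files.mismatch so that it also runs on JDK 8.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for this investigation; not part of UIMA.
public class FirstDifference {

  /**
   * Returns the index of the first byte at which the two arrays differ.
   * If one array is a strict prefix of the other, returns the length of the
   * shorter one. Returns -1 if both arrays are identical.
   */
  static int firstMismatch(byte[] a, byte[] b) {
    int common = Math.min(a.length, b.length);
    for (int i = 0; i < common; i++) {
      if (a[i] != b[i]) {
        return i;
      }
    }
    return a.length == b.length ? -1 : common;
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.out.println("usage: FirstDifference <actual-file> <reference-file>");
      return;
    }

    byte[] actual = Files.readAllBytes(Paths.get(args[0]));
    byte[] reference = Files.readAllBytes(Paths.get(args[1]));

    int offset = firstMismatch(actual, reference);
    if (offset < 0) {
      System.out.println("files are byte-identical");
    } else {
      System.out.printf("first difference at byte offset %d (lengths: %d vs %d)%n",
          offset, actual.length, reference.length);
    }
  }
}
```

Running it as `java FirstDifference actual-cas.bin reference-cas.bin` on the outputs of a passing and a failing JDK would show whether the files differ early on or only inside the payload.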