jakartaee/data

[TCK]: Inconsistent Enum Persistence Between Jakarta Persistence and Jakarta NoSQL Affects Query Behavior in TCK Tests

otaviojava opened this issue ยท 4 comments

Specification Version

1.0.0

Bug report

There is an inconsistency in how enums are persisted by default in Jakarta Persistence (JPA) and Jakarta NoSQL, which significantly impacts queries and the related TCK (Technology Compatibility Kit) tests that involve fields of enum type.

Issue Overview:

  • Jakarta Persistence (JPA):

    • By default, JPA persists enums as ordinal values (integers) in SQL databases. This means the position of the enum in its declaration (e.g., 0 for the first constant, 1 for the second, etc.) is stored in the database.
  • Jakarta NoSQL:

    • In contrast, Jakarta NoSQL persists enums as their string names by default in NoSQL databases like MongoDB. This means that the actual name of the enum constant (e.g., "ONE", "PRIME", "COMPOSITE") is stored.

Impact on Query Behavior:

This difference in storage strategies leads to inconsistencies in query results between SQL and NoSQL databases when dealing with enum fields. Specifically:

  • Sorting Behavior:

    • In SQL, sorting by an enum field will follow the ordinal (numerical) order. For example, ONE (0), PRIME (1), COMPOSITE (2).
    • In NoSQL, sorting by the same enum field will follow lexicographical (alphabetical) order. For example, "COMPOSITE", "ONE", "PRIME".
  • Filtering and Equality Checks:

    • Comparisons and equality checks against enum fields will behave differently depending on whether the field is stored as an ordinal or a string, potentially leading to unexpected results in TCK tests.

Solution

That is a classic case for test specialization; thus, I could create tests to verify it on the NoSQL side; we already have a package for NoSQL. The goal could be to use the NoSQLEntityTests for this.

I mean, that's why we say explicitly:

This specification does not define an order for the sorting of Java enum values, which is provider-dependent. A programming model for entity classes might allow control over the order of enum values.

Thus, I only need to change the TCK;
@njr-11
please let me know if it is ok to create those tests at NoSQLEntityTests.

Oh OK I misunderstood.

It looks like TCK tests were never updated accordingly after the decision was made to have order of enums be vendor-specific. I see at least 4 tests that depend on a NumberType attribute to be ordered in a particular way. Creating NoSQL-specific tests isn't the right answer. The TCK would still be invalid for anything else that isn't JPA. JPA by the way can configure how the enum value is persisted and could be made to use a String instead, but that isn't the right answer either due to what the spec says. I'm thinking the least intrusive way to correct this will be to duplicate the NumberType attribute on the entity into two attributes where one is the enum (preserving most tests as is because they don't order by it) and another is a numeric constant value that can be used to consistently order in the way that tests already expect. I can look into this on Monday.