meltano/sdk

bug: TypeConformance.Recursive doesn't remove Object or Array of Objects properties from the schema when they exist in the Record but not the Schema

visch opened this issue · 0 comments

Singer SDK Version

0.34.1

Is this a regression?

  • Yes

Python Version

3.9

Bug scope

Taps (catalog, state, etc.)

Operating System

Windows, and Linux

Description

From a user's perspective, it looks as if our users don't have any identities or a manager assigned, even though they do; the tap is just discarding the information.

To reproduce:

  1. Have a tap return an object (or an array of objects) whose sub-properties are not all defined in the schema

The tap then logs:

Properties ('identities.issuer', 'identities.issuerAssignedId', 'manager.Id') were present in the 'users' stream but not found in catalog schema. Ignoring.

An example schema is below.

{"type": "SCHEMA", "stream": "users", "schema": {"properties": {"id": {"type": ["string", "null"]},  "identities": {"items": {"properties": {}, "type": "object"}, "type": ["array", "null"]}, "manager": {"properties": {}, "type": ["object", "null"]}}, "type": "object"}, "key_properties": ["id"]}

Technical Description

The issue is that the target still creates the columns identities and manager, because they exist in the schema. If type conformance is not going to send the data for either of those columns, then the schema shouldn't declare those columns either.

A simple fix for me is to drop the conformance level to root only:

TYPE_CONFORMANCE_LEVEL = TypeConformanceLevel.ROOT_ONLY

However, because the columns were still created in my destination, this really threw me for a loop and burned a few hours.
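To make the behaviour concrete, here is a minimal, self-contained sketch of what recursive conformance appears to do to a record under the schema above, and why the root-only level avoids it. This is not the SDK's implementation: `Level`, `conform`, and the record values are hypothetical stand-ins written for illustration, and only the schema is taken from this report.

```python
from enum import Enum


class Level(Enum):
    """Hypothetical stand-in for the SDK's TypeConformanceLevel."""

    NONE = 0
    ROOT_ONLY = 1
    RECURSIVE = 2


def conform(record, schema, level, path="", dropped=None):
    """Drop record keys missing from the schema; return (record, dropped paths).

    Simplified illustration only, not the singer-sdk code.
    """
    if dropped is None:
        dropped = []
    if level is Level.NONE:
        return record, dropped
    props = schema.get("properties", {})
    out = {}
    for name, value in record.items():
        if name not in props:
            dropped.append(path + name)  # key is not declared in the schema
            continue
        sub = props[name]
        if level is Level.RECURSIVE and isinstance(value, dict):
            # Descend into nested objects.
            value, _ = conform(value, sub, level, f"{path}{name}.", dropped)
        elif level is Level.RECURSIVE and isinstance(value, list):
            # Descend into each object inside an array.
            items = sub.get("items", {})
            value = [
                conform(v, items, level, f"{path}{name}.", dropped)[0]
                if isinstance(v, dict)
                else v
                for v in value
            ]
        out[name] = value
    return out, dropped


# Schema taken from the SCHEMA message in this report.
schema = {
    "properties": {
        "id": {"type": ["string", "null"]},
        "identities": {
            "items": {"properties": {}, "type": "object"},
            "type": ["array", "null"],
        },
        "manager": {"properties": {}, "type": ["object", "null"]},
    },
    "type": "object",
}

# Hypothetical record; the values are made up for illustration.
record = {
    "id": "u1",
    "identities": [{"issuer": "example.com", "issuerAssignedId": "abc"}],
    "manager": {"Id": "m1"},
}

recursive_out, recursive_dropped = conform(record, schema, Level.RECURSIVE)
root_out, root_dropped = conform(record, schema, Level.ROOT_ONLY)

print(recursive_out)
print(recursive_dropped)
print(root_out == record)
```

In this sketch the recursive level leaves identities and manager present but empty, which matches the symptom described: the columns are still declared in the schema and created in the destination, but no data arrives. With the root-only level the nested values pass through untouched.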