MarquezProject/marquez

[PERF] DatasetDAO findAll query fails if there are too many facets

sophiely opened this issue · 0 comments

When a namespace has too many datasets and dataset facets, the query can fail with this error:

{"code":500,"message":"org.postgresql.util.PSQLException: ERROR: total size of jsonb array elements exceeds the maximum of 268435455 bytes [statement:"/* DatasetDao.findAll */ SELECT d.*, dv.fields, dv.lifecycle_state, sv.schema_location, t.tags, facets\nFROM datasets_view d\nLEFT JOIN dataset_versions dv ON d.current_version_uuid = dv.uuid\nLEFT JOIN stream_versions AS sv ON sv.dataset_version_uuid = dv.uuid\nLEFT JOIN ...

Here is the current query:

image

If we zoom in on the facets column, we can see values like these:

[
{
"schema": {...}
},
{
"dataSource": {...}
},
{
"dataSource": {...}
},
{
"schema": {...}
},
{
"dataSource": {...}
},
{
"schema": {...}
},
{
"schema": {...}
},
{
"dataSource": {...}
}
]

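The same facet type is repeated once for each dataset version UUID, so the aggregated array grows with the number of versions instead of with the number of distinct facet types.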
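To illustrate how much of that array is redundant, here is a minimal Python sketch that collapses the repeated entries down to one per facet type. It is not the Marquez implementation and not a proposed patch. The function name `dedupe_facets`, the sample data, and the rule that a later entry replaces an earlier one of the same type are all assumptions made for illustration.

```python
import json


def dedupe_facets(facets):
    """Keep one entry per facet name; a later entry replaces an earlier one.

    Assumption: each element is a single-key object such as {"schema": {...}},
    and later elements should win. Neither is confirmed by the issue.
    """
    latest = {}
    for facet in facets:
        for name, body in facet.items():
            latest[name] = body
    return [{name: body} for name, body in latest.items()]


# Hypothetical sample shaped like the array above; {"i": n} stands in for the
# elided facet bodies and records each element's position.
facets = [
    {"schema": {"i": 0}},
    {"dataSource": {"i": 1}},
    {"dataSource": {"i": 2}},
    {"schema": {"i": 3}},
    {"dataSource": {"i": 4}},
    {"schema": {"i": 5}},
    {"schema": {"i": 6}},
    {"dataSource": {"i": 7}},
]

deduped = dedupe_facets(facets)

# Compare the serialized size before and after collapsing duplicates.
print(len(json.dumps(facets)), len(json.dumps(deduped)))
```

With eight entries across two facet types, only two entries remain. In the real query the equivalent reduction would have to happen in SQL, before the jsonb array is aggregated, for it to avoid the size limit.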