gbif/pipelines

Adapt clustering to use the newly added multivalue fields

marcos-lg opened this issue · 1 comment

Issue #665 introduced some new interpreted fields and changed typeStatus from a string to an array.

Clustering previously read some of these fields as strings, because they were carried over from the verbatim values. They are now interpreted fields in the basic record.

You can see the changes made to the Avro schemas here.

Clustering needs to be adapted to these changes, either by working with the arrays directly or by converting the arrays into strings.
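To illustrate the two options, here is a minimal sketch in Python. It is not the clustering code: the helper names (`as_list`, `as_string`, `share_value`) and the `|` delimiter are assumptions made for this example. It shows how a field that may arrive as null, a single string, or an array could be normalised, then either flattened to a string or compared as a set of values.

```python
from typing import Iterable, List, Optional, Union

# Hypothetical delimiter for flattening arrays; the real choice is up to the implementation.
DELIMITER = "|"

FieldValue = Union[None, str, Iterable[Optional[str]]]


def as_list(value: FieldValue) -> List[str]:
    """Normalise a field that may be None, a single string, or an array of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value] if value.strip() else []
    # Drop null and blank entries from the array.
    return [v for v in value if v is not None and v.strip()]


def as_string(value: FieldValue, delimiter: str = DELIMITER) -> Optional[str]:
    """Option 1: convert the array into a single delimited string (None when empty)."""
    items = as_list(value)
    return delimiter.join(items) if items else None


def share_value(a: FieldValue, b: FieldValue) -> bool:
    """Option 2: use the arrays directly; True if the two fields have a value in common."""
    left = {v.strip().lower() for v in as_list(a)}
    right = {v.strip().lower() for v in as_list(b)}
    return not left.isdisjoint(right)
```

For example, `as_string(["HOLOTYPE", "PARATYPE"])` keeps the old string-based comparisons usable, while `share_value(["Holotype"], "HOLOTYPE")` treats the old string form and the new array form as equivalent inputs.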

Thanks. This will be a simple change and should not hold up #665 going live.