entity coherence checking tool
A Python script to check Orion DB entities for inconsistencies. In particular, the following rules would be taken into account:
Rules to check:
- Rule 10: `_id` field consistency
- Rule 11: mandatory fields in entity
- Rule 12: mandatory fields in attribute
- Rule 13: `attrNames` field consistency
- Rule 14: `mdNames` field consistency
- Rule 15: not swapped subkeys in `_id`
- Rule 16: `location` field consistency
- Rule 17: `lastCorrelator` existence
- Rule 20: entity id syntax
- Rule 21: entity type syntax
- Rule 22: entity servicePath syntax
- Rule 23: attribute name syntax
- Rule 24: attribute type syntax
- Rule 25: metadata name syntax
- Rule 26: metadata type syntax
- Rule 90: detect usage of `geo:x` attribute type where `x` different from `json`
- Rule 91: detect usage of more than one legacy `location` metadata
- Rule 92: detect legacy `location` metadata should be `WGS84` or `WSG84`
- Rule 93: detect usage of redundant legacy `location`
- Rule 94: detect usage of not redundant legacy `location`
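To illustrate what a per-entity rule might look like, here is a minimal sketch of Rule 13 in Python. The function name `check_attr_names` is hypothetical. The sketch assumes that an entity document has an `attrs` object keyed by attribute name and an `attrNames` array, and that a `.` in an attribute name is stored as `=` in the `attrs` key. These assumptions should be checked against the entity datamodel before relying on it.

```python
def check_attr_names(entity):
    """Return a list of problems found for Rule 13 (empty list if consistent).

    Hypothetical helper: compares the keys of 'attrs' with the 'attrNames' array.
    """
    attrs = entity.get("attrs", {})
    listed = set(entity.get("attrNames", []))

    # Assumption: '.' in attribute names is stored as '=' in the attrs keys
    names_from_keys = {key.replace("=", ".") for key in attrs}

    problems = []
    for name in sorted(names_from_keys - listed):
        problems.append(f"attribute '{name}' is in attrs but not in attrNames")
    for name in sorted(listed - names_from_keys):
        problems.append(f"attribute '{name}' is in attrNames but not in attrs")
    return problems


# Usage example with a made-up entity document
sample = {
    "_id": {"id": "Room1", "type": "Room", "servicePath": "/"},
    "attrs": {"temperature": {"type": "Number", "value": 23}},
    "attrNames": ["temperature", "pressure"],
}
for problem in check_attr_names(sample):
    print(problem)
```

Returning a list of messages rather than a boolean would let the script report every violation found in an entity in a single pass.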
For Rules 20-26: https://github.com/telefonicaid/fiware-orion/blob/master/doc/manuals/orion-api.md#identifiers-syntax-restrictions
Entity datamodel: https://github.com/telefonicaid/fiware-orion/blob/master/doc/manuals/admin/database_model.md#entities-collection
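For Rules 20, 21 and 23 to 26, a shared helper could be enough, since they all check an identifier-like string. The following is only a sketch: the name `identifier_problem`, the maximum length and the forbidden character set are assumptions made for illustration, and the authoritative restrictions are the ones in the identifiers syntax link above. Rule 22 (servicePath) follows a different syntax and is not covered here.

```python
import re

# Assumed limits for illustration; check them against the Orion API documentation
MAX_ID_LEN = 256
# Assumed forbidden characters: whitespace, control characters, '&', '?', '/', '#'
FORBIDDEN = re.compile(r"[\s\x00-\x1f\x7f&?/#]")


def identifier_problem(value):
    """Return a description of the syntax problem, or None if the value looks valid."""
    if not isinstance(value, str):
        return "not a string"
    if len(value) == 0:
        return "empty"
    if len(value) > MAX_ID_LEN:
        return f"longer than {MAX_ID_LEN} characters"
    match = FORBIDDEN.search(value)
    if match:
        return f"forbidden character {match.group()!r}"
    return None


# Usage example: report the problem (if any) for a few made-up identifiers
for candidate in ["Room1", "Room 1", "a/b", ""]:
    print(repr(candidate), "->", identifier_problem(candidate))
```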
The new script makes the following ones obsolete:
- https://github.com/telefonicaid/fiware-orion/blob/master/scripts/managedb/check_entities_consistency.py
- https://github.com/telefonicaid/fiware-orion/blob/master/scripts/managedb/check_location_coherence.py
Btw, they are pretty old and use Python 2.x, so it is a good idea to remove both.
The check_legacy_location_metadata.py script in PR #4048 could also be integrated in the new script.
Rule 15 is special, as it is not individually checked on every entity. The following aggregation pipeline will help to detect violations:
db.entities.aggregate([{$group: { _id: {"id": "$_id.id", "type": "$_id.type", "servicePath": "$_id.servicePath"}, count: {$sum: 1} }}, {$match: {count: {$gt: 1}}}])
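The same grouping can be sketched in plain Python, which may be handy for testing the logic without a database. The function name `find_swapped_ids` is hypothetical. It normalizes each `_id` to an `(id, type, servicePath)` tuple, so two documents whose `_id` subkeys hold the same values in a different order fall into the same group, and it returns the groups that contain more than one document.

```python
from collections import Counter


def find_swapped_ids(entities):
    """Return the (id, type, servicePath) tuples that appear in more than one entity.

    Hypothetical helper mirroring the $group / $match pipeline above.
    """
    counts = Counter(
        (
            entity["_id"].get("id"),
            entity["_id"].get("type"),
            entity["_id"].get("servicePath"),
        )
        for entity in entities
    )
    return [key for key, count in counts.items() if count > 1]


# Usage example: the first two made-up documents differ only in subkey order
docs = [
    {"_id": {"id": "Room1", "type": "Room", "servicePath": "/"}},
    {"_id": {"type": "Room", "id": "Room1", "servicePath": "/"}},
    {"_id": {"id": "Room2", "type": "Room", "servicePath": "/"}},
]
print(find_swapped_ids(docs))
```

In a real script the `entities` argument could presumably be a pymongo cursor such as `db.entities.find({}, {"_id": 1})`, although for large collections running the aggregation on the server is likely the better choice.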