Get schema by id to use constant-time lookup
mpkocher commented
The core method get_schema_by_id performs an O(N) linear scan on every call.
class ESSE(object):
    """
    Exabyte Source of Schemas and Examples class.
    """

    def __init__(self):
        self.schemas = SCHEMAS
        self.examples = EXAMPLES

    def get_schema_by_id(self, schemaId):
        return next((s for s in SCHEMAS if s.get("schemaId") == schemaId), None)
While parsing in libs like Exabyte's express is probably limited by file-parsing IO, and N is small here (~200), get_schema_by_id is called from serialize_and_validate on every property. The call can be converted to an O(1) lookup with a minor change.
class ESSE(object):

    def __init__(self):
        self.schemas = SCHEMAS
        self._schemas = {s['schemaId']: s for s in self.schemas if s.get('schemaId') is not None}
        self.examples = EXAMPLES

    def get_schema_by_id(self, schemaId):
        return self._schemas.get(schemaId)
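To sanity-check that the two approaches return the same results, here is a minimal, self-contained sketch. The SCHEMAS list below is hypothetical stand-in data (the real one comes from the package), and the class names LinearESSE and IndexedESSE are made up for illustration. One subtle difference worth noting: with duplicate schemaId values, a dict comprehension keeps the last entry while next() returns the first, so this sketch uses setdefault to preserve first-match behaviour.

```python
# Hypothetical stand-in data; the real SCHEMAS list is provided by the package.
SCHEMAS = [{"schemaId": "schema-%d" % i, "title": "Schema %d" % i} for i in range(200)]
SCHEMAS.append({"title": "entry without a schemaId"})


class LinearESSE(object):
    """Illustrative copy of the current behaviour: scan the list on every call."""

    def __init__(self):
        self.schemas = SCHEMAS

    def get_schema_by_id(self, schemaId):
        return next((s for s in self.schemas if s.get("schemaId") == schemaId), None)


class IndexedESSE(object):
    """Illustrative copy of the proposed behaviour: build a dict once, look up by key."""

    def __init__(self):
        self.schemas = SCHEMAS
        self._schemas = {}
        for s in self.schemas:
            schema_id = s.get("schemaId")
            if schema_id is not None:
                # setdefault keeps the first occurrence, matching next() on duplicates
                self._schemas.setdefault(schema_id, s)

    def get_schema_by_id(self, schemaId):
        return self._schemas.get(schemaId)


linear, indexed = LinearESSE(), IndexedESSE()

# Compare both implementations on every known id plus one unknown id.
ids = [s["schemaId"] for s in SCHEMAS if "schemaId" in s] + ["missing-id"]
results_match = all(
    linear.get_schema_by_id(i) is indexed.get_schema_by_id(i) for i in ids
)
```

If results_match is true, the indexed version can be treated as a drop-in replacement for these inputs. The actual speedup could be measured with the standard library's timeit module against the real schema list.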
timurbazhirov commented
Just a quick note - thanks for this helpful suggestion, Michael! We'll review and plan to schedule this for inclusion in the next release (later in Q1 or early Q2).