nicolay-r/AREkit

DocumentEntityID Assignation -- code parts are implicitly synchronized with implementation

nicolay-r opened this issue · 1 comments

How to fix: in snippet №2 we can use the entity mapping.

The reason is that the code below is executed first:

def init_parsed_news(self, parsed_news):
    assert(isinstance(parsed_news, ParsedNews))
    self._doc_entities = []
    self.__entity_map.clear()
    for index, entity in enumerate(parsed_news.iter_entities()):
        doc_entity = DocumentEntity(id_in_doc=index,
                                    value=entity.Value,
                                    e_type=entity.Type,
                                    display_value=entity.DisplayValue,
                                    group_index=entity.GroupIndex)
        self._doc_entities.append(doc_entity)
        if self.__entity_index_func is not None:
            self.__entity_map[self.__entity_index_func(entity)] = doc_entity

It is then followed by this function in the inherited class:

def __calculate_entity_positions(self):
    """ Note: here we consider the same order as in self._entities.
    """
    positions = []
    t_ind_in_doc = 0
    for s_ind, t_ind_in_sent, term in self.__iter_raw_terms_func():
        if isinstance(term, Entity):
            position = TermPosition(term_ind_in_doc=t_ind_in_doc,
                                    term_ind_in_sent=t_ind_in_sent,
                                    s_ind=s_ind)
            positions.append(position)
        t_ind_in_doc += 1
    return positions

Both snippets assign identifiers, and each relies on its own iteration order over the entities.
However, when it comes to nested entities, this is expected to be synchronized in a single place.
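A minimal sketch of the idea, not the actual change: identifiers are assigned once while building the entity map, and the position calculation looks them up instead of counting entities again. The `Entity` and `DocumentEntity` classes here are simplified stand-ins, `build_entity_map` and `calculate_entity_positions` are hypothetical free functions, and the case where the two iterations yield entities in a different order is an assumption used only to illustrate the problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Simplified stand-in for an entity term (hypothetical)."""
    value: str


@dataclass(frozen=True)
class DocumentEntity:
    """Simplified stand-in: an entity together with its in-document id."""
    id_in_doc: int
    value: str


def build_entity_map(entities, entity_index_func):
    """The single place where identifiers are assigned."""
    return {entity_index_func(e): DocumentEntity(id_in_doc=i, value=e.value)
            for i, e in enumerate(entities)}


def calculate_entity_positions(raw_terms, entity_map, entity_index_func):
    """Return {id_in_doc: (s_ind, t_ind_in_sent, t_ind_in_doc)}.

    The id comes from the mapping, so the result does not depend on the
    order in which raw_terms yields the entities.
    """
    positions = {}
    for t_ind_in_doc, (s_ind, t_ind_in_sent, term) in enumerate(raw_terms):
        if isinstance(term, Entity):
            doc_entity = entity_map[entity_index_func(term)]
            positions[doc_entity.id_in_doc] = (s_ind, t_ind_in_sent, t_ind_in_doc)
    return positions


a, b = Entity("Moscow"), Entity("Russia")

# Order in which identifiers are assigned: a -> 0, b -> 1.
entities = [a, b]

# Assumed for illustration: the term iteration yields b before a.
raw_terms = [(0, 0, b), (0, 1, "and"), (0, 2, a)]

entity_map = build_entity_map(entities, entity_index_func=id)
positions = calculate_entity_positions(raw_terms, entity_map, entity_index_func=id)

print(positions[0])  # position of `a`: (0, 2, 2)
```

With a plain counter in the second loop, `b` would have received id 0 here, which disagrees with the mapping.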

Fixed in 8c1ee11