jmsv/ety-python

Circular origin reference

alxwrd opened this issue · 3 comments

There's currently at least one circular reference in etymwn-relety.json.

ety.origins("software", recursive=True)

will eventually fail with a recursion error because software -> soft, -ware and -ware -> software.

I'm going to run a job to try and discover all these problems with the data.
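As a rough illustration of what such a job looks for, here is a minimal sketch that finds a cycle in a toy origin mapping with a depth-first walk. The `ORIGINS` dict and the `find_cycle` helper are hypothetical stand-ins for illustration; they are not part of ety or of the real data file.

```python
# Hypothetical toy data: each word maps to its direct origins.
ORIGINS = {
    "software": ["soft", "-ware"],
    "-ware": ["software"],
    "soft": [],
    "hardware": ["hard", "-ware"],
    "hard": [],
}


def find_cycle(start, origins):
    """Return the first cycle reachable from `start` as a list of words, or None."""
    path = []        # words on the current depth-first branch, in order
    on_path = set()  # the same words, for fast membership checks
    done = set()     # words fully explored without finding a cycle

    def visit(word):
        if word in on_path:
            # We came back to a word on the current branch: that is a cycle.
            return path[path.index(word):] + [word]
        if word in done:
            return None
        path.append(word)
        on_path.add(word)
        for parent in origins.get(word, []):
            cycle = visit(parent)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(word)
        done.add(word)
        return None

    return visit(start)


# Map every word that can reach a cycle to the cycle it reaches.
cyclic = {}
for word in ORIGINS:
    cycle = find_cycle(word, ORIGINS)
    if cycle:
        cyclic[word] = cycle
```

On real data with long chains, the nested `visit` calls could themselves hit Python's recursion limit, so an explicit stack might be a safer choice there.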

Hidden as no longer relevant

import sys
import ety

sys.setrecursionlimit(20)

total = len(ety.data.etyms)

def find():
    results = []
    errors = []
    for count, word in enumerate(ety.data.etyms):
        try:
            ety.Word(word["a_word"], word["a_lang"]).origins()
            print("{}/{}".format(count, total), end="\r")
        except RecursionError:
            # Record the word that failed; the origins() call never returned,
            # so there is no result to store.
            results.append(word)
        except Exception as e:
            errors.append({
                "error": e,
                "word": word
                })

    return results, errors

>>> import find_circulars
>>> results, errors = find_circulars.find()
4094/473433


I'll update here once it's done.

I've just realised this isn't actually recursion because it's the child's .origins() that's being called.

The real issue is that each origin's results are appended to the same result list that is being iterated over, so the chain "software" -> "-ware" -> "software" keeps growing the list indefinitely.
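To make that failure mode concrete, here is a small sketch on toy data. It assumes the recursive lookup loops over the result list while appending to it; `ORIGINS`, `origins_growing`, and the `cap` parameter are hypothetical and exist only so the demo terminates. This is not the library's actual code.

```python
# Hypothetical toy data: each word maps to its direct origins.
ORIGINS = {
    "software": ["soft", "-ware"],
    "-ware": ["software"],
    "soft": [],
}


def origins_growing(word, origins, cap=12):
    """Expand origins by appending to the list being iterated (demo only)."""
    result = list(origins.get(word, []))
    # A for-loop over a list also visits items appended during the loop,
    # so a cycle in the data keeps feeding it new work.
    for origin in result:
        if len(result) >= cap:  # safety cap so this demo stops
            break
        result.extend(origins.get(origin, []))
    return result


grown = origins_growing("software", ORIGINS)
```

Without the cap, the loop would never finish for "software", and memory use would climb until the process is killed; no `RecursionError` is raised because no call stack is growing.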

I have a fix in mind, I'll submit a PR later.

jmsv commented

Hmm, could the solution be as simple as only adding a Word to a branch if it hasn't already appeared in that branch?
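A minimal sketch of that idea, using a toy mapping rather than ety's real `Word` objects. The `ORIGINS` dict and `origins_no_repeat` function are hypothetical names for illustration, and an actual fix in the library may look different.

```python
# Hypothetical toy data: each word maps to its direct origins.
ORIGINS = {
    "software": ["soft", "-ware"],
    "-ware": ["software"],
    "soft": [],
}


def origins_no_repeat(word, origins):
    """Collect ancestors of `word`, skipping words already on the current branch."""
    result = []

    def walk(current, branch):
        for parent in origins.get(current, []):
            if parent in branch:
                # Already seen on this branch: following it would loop forever.
                continue
            result.append(parent)
            # Each branch gets its own copy of the seen-set, so the same word
            # may still appear on a different, unrelated branch.
            walk(parent, branch | {parent})

    walk(word, {word})
    return result


print(origins_no_repeat("software", ORIGINS))  # ['soft', '-ware']
print(origins_no_repeat("-ware", ORIGINS))     # ['software', 'soft']
```

Tracking the seen words per branch, rather than globally, keeps legitimate repeats across separate branches while still cutting every cycle.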