SekouD/mlconjug

English Irregular verbs do not seem to conjugate properly

NoelHVincent opened this issue · 2 comments

This seems related to issue #77

I manually compiled a list of irregular past participles:
['arisen', 'awoken', 'been', 'born', 'beaten', 'become', 'begun', 'bent', 'bet', 'bound', 'bitten', 'bled', 'blown', 'broken', 'bred', 'brought','broadcast', 'built', 'burnt', 'burst', 'bought', 'been able to', 'caught', 'chosen', 'clung', 'come', 'cost', 'crept', 'cut', 'delt', 'dug', 'done', 'dreamt', 'drunk', 'driven', 'eaten', 'fallen', 'fed', 'felt', 'fought', 'found', 'flown', 'forbidden', 'forgotten', 'forgiven', 'frozen', 'got', 'given', 'gone', 'ground', 'grown', 'hung', 'had', 'heard', 'hidden', 'hit', 'held', 'hurt', 'kept', 'knelt', 'known', 'laid', 'led', 'learned', 'left', 'lent', 'lain', 'lied', 'lit', 'lost', 'made', 'ment', 'met','mowed', 'overtaken', 'paid', 'put', 'read', 'ridden', 'rung', 'risen', 'run', 'sawn', 'said', 'seen', 'sold', 'sent', 'set', 'sewn', 'shaken', 'shed', 'shone', 'shot', 'shown', 'shrunk', 'shut', 'sung', 'sunk', 'sat','slept', 'slid', 'smelt', 'sown', 'spoken', 'spelled', 'spent', 'spilled', 'spat', 'spread', 'stood', 'stolen', 'stuck', 'stung', 'stunk', 'struck', 'sworn', 'swept', 'swollen', 'swum', 'swung', 'taken', 'taught', 'torn', 'told', 'thought', 'thrown', 'understood', 'woken', 'worn', 'wept', 'won', 'wound', 'written']

Then I converted these to the present tense with spaCy (as a reference):
['arise', 'awake', 'be', 'bear', 'beat', 'become', 'begin', 'bend', 'bet', 'bind', 'bite', 'bleed', 'blow', 'break', 'breed', 'bring', 'broadcast', 'build', 'burn', 'burst', 'buy', 'be', 'catch', 'choose', 'clung', 'come', 'cost', 'creep', 'cut', 'delt', 'dig', 'do', 'dream', 'drunk', 'drive', 'eat', 'fall', 'feed', 'feel', 'fight', 'find', 'fly', 'forbid', 'forget', 'forgive', 'freeze', 'get', 'give', 'go', 'ground', 'grow', 'hang', 'have', 'hear', 'hide', 'hit', 'hold', 'hurt', 'keep', 'knelt', 'know', 'lay', 'lead', 'learn', 'leave', 'lend', 'lie', 'lie', 'light', 'lose', 'make', 'ment', 'meet', 'mow', 'overtaken', 'pay', 'put', 'read', 'ride', 'ring', 'rise', 'run', 'sawn', 'say', 'see', 'sell', 'send', 'set', 'sew', 'shake', 'shed', 'shine', 'shoot', 'show', 'shrink', 'shut', 'sing', 'sink', 'sit', 'sleep', 'slide', 'smell', 'sow', 'speak', 'spell', 'spend', 'spill', 'spat', 'spread', 'stand', 'steal', 'stick', 'sting', 'stunk', 'strike', 'swear', 'sweep', 'swell', 'swum', 'swing', 'take', 'teach', 'tear', 'tell', 'think', 'throw', 'understand', 'wake', 'wear', 'weep', 'win', 'wound', 'write']

Then I tried to convert these to present tense with mlconjug:
[only 23 out of 133 conjugated correctly]

Number Correct: 23
['bind', 'bind', 'become', 'bind', 'bet', 'bind', 'broadcast', 'burst', 'come', 'cost', 'cut', 'find', 'ground', 'hit', 'keep', 'put', 'read', 'run', 'set', 'shed', 'shut', 'spat', 'wound']

Number Incorrect: 110
['ariind', 'awoind', 'beaind', 'beguncast', 'bound', 'bitind', 'blind', 'broind', 'brep', 'brouind', 'buind', 'buind', 'bouind', 'been able to', 'cauind', 'choind', 'clind', 'crind', 'dind', 'dugide', 'done', 'dreind', 'drind', 'driind', 'eaind', 'falind', 'fep', 'fouind', 'found', 'flind', 'forbidind', 'forgotind', 'forgiind', 'froind', 'got', 'giind', 'gone', 'grind', 'hind', 'hep', 'heind', 'hidind', 'hind', 'kind', 'knind', 'knind', 'lind', 'lep', 'learind', 'lind', 'lind', 'lind', 'lind', 'lit', 'lind', 'made', 'mind', 'met', 'moind', 'overtaind', 'pind', 'ridind', 'rind', 'riind', 'sind', 'sind', 'sind', 'sind', 'sind', 'sind', 'shaind', 'shone', 'shot', 'shind', 'shrind', 'sind', 'sind', 'sat', 'slind', 'sind', 'smelt', 'sind', 'spoind', 'spelind', 'spind', 'spilind', 'sprake', 'stind', 'stoind', 'stind', 'stind', 'stind', 'strind', 'swind', 'swind', 'swolind', 'swul', 'swind', 'taind', 'tauind', 'tind', 'tind', 'thouind', 'thrind', 'understind', 'woind', 'wind', 'wind', 'wonind', 'writind']

The following code was used to generate the lists above.
It's possible that I made a mistake.
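import mlconjug

# verbList is the list of past participles shown at the top of this issue
default_conjugator = mlconjug.Conjugator(language='en')
mlOutput = []
for item in verbList:
    # first-person singular, indicative present
    pp = default_conjugator.conjugate(item).conjug_info['indicative']['indicative present']['1s']
    mlOutput.append(pp)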
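
For anyone reproducing the counts, the correct/incorrect split could be tallied roughly as follows. This is only a sketch, not the code used for the numbers above: the helper name `tally` is hypothetical, and the three sample pairs are picked from the lists in this issue purely for illustration.

```python
def tally(reference, predicted):
    """Split predictions into those that match the reference and those that do not."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    correct = [p for r, p in pairs if p == r]
    incorrect = [p for r, p in pairs if p != r]
    return correct, incorrect


# Illustrative sample only: reference forms and the corresponding mlconjug output
reference = ['become', 'bet', 'arise']
predicted = ['become', 'bet', 'ariind']

correct, incorrect = tally(reference, predicted)
print("Number Correct:", len(correct))
print(correct)
print("Number Incorrect:", len(incorrect))
print(incorrect)
```

Passing the full spaCy list as `reference` and `mlOutput` as `predicted` should give the same kind of summary as above, assuming both lists are in the same order.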

Hi @NoelHVincent, I am @SekouDiaoNlp, formerly @SekouD on GitHub, and the original author of mlconjug. My former account was mostly for academic projects; the new @SekouDiaoNlp account is focused on my professional and open-source projects.
I have just released a new version of mlconjug called mlconjug3. The new name indicates that I dropped support for Python 2.x, which reached its end of life at the start of 2020.

To install it, run:
pip install mlconjug3

As for your original bug report, the bug has been fixed in version 3.7.1 of mlconjug3.
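
To check whether an environment already has a release with the fix, something like the following might help. It is a minimal sketch using only the standard library; `version_tuple` and `has_fix` are hypothetical helper names, and the parsing assumes plain dotted version numbers such as 3.7.1.

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (3, 7, 1)  # release named in this thread as containing the fix


def version_tuple(text):
    """Turn '3.7.1' into (3, 7, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(text):
    """Return True if the given version string is 3.7.1 or later."""
    return version_tuple(text) >= FIXED_IN


try:
    installed = version("mlconjug3")
except PackageNotFoundError:
    print("mlconjug3 is not installed")
else:
    print(installed, "includes the fix" if has_fix(installed) else "predates the fix")
```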

Have a nice day and thank you for using mlconjug3.

Cheers!