Lowercase of all languages needed in utils.py
Closed · 1 comment
Manamama commented
In utils.py, I needed to change normalize_language so that it returns language.lower():
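from pycountry import languages  # ISO 639 language database used by the lookup below

def normalize_language(language):
    # Try to resolve 2-letter and 3-letter ISO codes to a full language name.
    for lookup_key in ("alpha_2", "alpha_3"):
        try:
            lang = languages.get(**{lookup_key: language})
            if lang:
                language = lang.name.lower()
        except KeyError:
            pass
    # The change: lowercase the result even when no code matched (e.g. "Polish").
    return language.lower()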
This avoids cryptic errors when the language name is capitalized, for example:
sumy text-rank --format=html --language=Polish
sumy text-rank --format=html --language=French
etc.
->
> LookupError: NLTK tokenizers are missing or the language is not supported.
> Download them by following command: python -c "import nltk; nltk.download('punkt')"
> Original error was:
>
> **********************************************************************
> Resource punkt not found.
> Please use the NLTK Downloader to obtain the resource:
>
> >>> import nltk
> >>> nltk.download('punkt')
>
> For more information see: https://www.nltk.org/data.html
>
> Attempted to load tokenizers/punkt/PY3/Polish.pickle
>
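To make the failure mode concrete, here is a minimal, self-contained sketch of why the capitalized name breaks the lookup. It is not sumy's actual code: `_ISO_CODES` is a hypothetical two-language stand-in for the pycountry lookup, and `punkt_resource_path` is a hypothetical helper that only mirrors the path shape shown in the error above (`tokenizers/punkt/PY3/Polish.pickle`). The assumption is that the tokenizer files are stored under lowercase names, so the name has to be lowercased before the path is built.

```python
# Hypothetical stand-in for pycountry's ISO 639 lookup (illustration only).
_ISO_CODES = {"pl": "Polish", "pol": "Polish", "fr": "French", "fra": "French"}


def normalize_language(language):
    """Return a lowercase language name for an ISO code or a language name."""
    # Resolve a known code to its name; otherwise keep the input unchanged.
    name = _ISO_CODES.get(language.lower(), language)
    # Always lowercase, so "Polish" and "polish" behave the same.
    return name.lower()


def punkt_resource_path(language):
    """Build a resource path in the shape shown in the error message (assumed)."""
    return "tokenizers/punkt/PY3/{}.pickle".format(normalize_language(language))


# Capitalized names, lowercase names, and ISO codes all map to the same path.
assert punkt_resource_path("Polish") == punkt_resource_path("polish")
assert punkt_resource_path("pl") == "tokenizers/punkt/PY3/polish.pickle"
```

Without the final `.lower()`, the sketch would build `Polish.pickle` for `--language=Polish`, which matches the path in the error message.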
Otherwise the package is great.
miso-belica commented
Thank you, should be fixed in main now.