The purpose of this task is to build a program which can grade a Japanese text by how many different types of kanji (characters, 漢字) are used.
Japanese kanji can be grouped in different ways. In this exercise we will use the Japanese Language Proficency Test (JLPT) grouping. There are 5 levels.
- Level 5 (N5) roughly contains 80 kanji
- Level 4 (N4) has a further 170 kanji
- Level 3 (N3) then has 370 kanji more
- Level 2 (N2) has 380 kanji
- Level 1 (N1) has a final 1136 kanji
If one has mastered N1, then it is implied than one has also mastered N5-N4.
The end goal of this exercise is to create a commandline program which can be invoked as follows with an output similar to
> kanji_grader my_japanese_book.txt
N5: 55%
N4: 69%
N3: 83%
N2: 95%
N1: 99%
First install poetry.
And then run
poetry run pytest
This should result in an output ending with
FAILED tests/test_task1.py::test_one_n5 - ValueError: TODO
FAILED tests/test_task1.py::test_irrelevant_chars - ValueError: TODO
FAILED tests/test_task1.py::test_mix_of_chars - ValueError: TODO
Then everything is good!
Write your code in grader.py
. Task 1 is finished when tests in test_task1
are green.