Chinese sentences with simplified and traditional characters, pinyin, and English translations, for offline use in an app.
Sentence data is taken from Tatoeba. Pinyin and traditional characters were generated using the Python modules pinyin_jyutping_sentence and hanziconv. Translations were generated using Google Sheets.
There are 63352 sentences in total in the db and tsv file.
The tsv file is tab-separated, with the following columns:
| id | Simplified | Traditional | Pinyin | English |
| --- | --- | --- | --- | --- |
| 10 | 我不知道。 | 我不知道。 | wǒ bù zhīdào 。 | I do not know. |
The sen_data.db database contains a table named examples with the columns id, simplified, traditional, pinyin, and english.
Get two random sentences with pinyin, traditional characters and translation
View read_2_random_sen.py
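A minimal sketch of such a query, using only Python's standard sqlite3 module. It assumes the examples table and column names described above; the function name get_random_sentences is hypothetical and this is not necessarily how read_2_random_sen.py is written.

```python
import sqlite3


def get_random_sentences(conn, count=2):
    """Return `count` randomly chosen rows from the `examples` table.

    Hypothetical helper: the table and column names are taken from the
    description of sen_data.db, not from the script itself.
    """
    cursor = conn.execute(
        "SELECT id, simplified, traditional, pinyin, english "
        "FROM examples ORDER BY RANDOM() LIMIT ?",
        (count,),
    )
    return cursor.fetchall()
```

To try it, open the database with sqlite3.connect("sen_data.db") and pass the connection to get_random_sentences. ORDER BY RANDOM() scans the whole table, which should be acceptable for a table of this size.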
- Download sentences database from Tatoeba
- Use Google Translate to translate the sentences
- Use the Python modules pinyin_jyutping_sentence and hanziconv to generate pinyin and traditional characters for the sentences
- Use gen_sen.py to write the data to a .tsv file
- Use tsv_to_db.py to create the database.
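
The last step can be sketched with the standard csv and sqlite3 modules. This is an illustration only, not the contents of tsv_to_db.py: the function name, the column types, and the handling of a possible header row are assumptions.

```python
import csv
import sqlite3


def tsv_to_db(tsv_path, db_path):
    """Load a five-column TSV file into an `examples` table.

    Returns the number of data rows read from the file.
    Hypothetical sketch; the column types are assumed.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS examples ("
            "id INTEGER PRIMARY KEY, simplified TEXT, traditional TEXT, "
            "pinyin TEXT, english TEXT)"
        )
        with open(tsv_path, encoding="utf-8", newline="") as handle:
            # QUOTE_NONE keeps any quotation marks inside sentences as-is.
            reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
            # Keep only rows with five fields and a numeric id, which also
            # skips a header row if the file has one.
            rows = [
                (int(row[0]), row[1], row[2], row[3], row[4])
                for row in reader
                if len(row) == 5 and row[0].isdigit()
            ]
        conn.executemany(
            "INSERT OR REPLACE INTO examples VALUES (?, ?, ?, ?, ?)", rows
        )
        conn.commit()
        return len(rows)
    finally:
        conn.close()
```

INSERT OR REPLACE lets the script be rerun on the same database without failing on ids that are already present.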