A Chinese-Uighur parallel translation corpus of 100,000 pairs, stored as TXT documents. Fluency and fidelity of the translations are both above 80%. The data has been cleaned, desensitized, and quality-checked, and can serve as a base corpus for text data analysis and for fields such as machine translation. For more details, please refer to the link: https://www.nexdata.ai/datasets/nlu/149?source=Github
TXT
Chinese-Uighur Parallel Corpus Data
100,000 pairs of Chinese-Uighur parallel corpus data
Chinese, Uighur
machine translation
Commercial License
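
As a rough illustration of how a TXT parallel corpus like the one described above might be read into pairs for machine translation work, here is a minimal sketch. The file layout is not documented here, so everything about it is an assumption: one pair per line, Chinese first and Uighur second, separated by a tab, encoded as UTF-8. The function names `parse_parallel_lines` and `load_parallel_corpus` are hypothetical, and the sample lines are placeholders, not actual corpus content.

```python
from typing import Iterable, List, Tuple


def parse_parallel_lines(
    lines: Iterable[str], sep: str = "\t"
) -> List[Tuple[str, str]]:
    """Turn raw text lines into (chinese, uighur) pairs.

    Assumption: each line holds exactly one pair, split by `sep`.
    Blank lines and lines without the separator are skipped.
    """
    pairs = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(sep, 1)
        if len(parts) != 2:
            continue  # skip lines that do not contain the separator
        source, target = parts[0].strip(), parts[1].strip()
        if source and target:
            pairs.append((source, target))
    return pairs


def load_parallel_corpus(
    path: str, sep: str = "\t", encoding: str = "utf-8"
) -> List[Tuple[str, str]]:
    """Read a TXT file from `path` (hypothetical layout) into pairs."""
    with open(path, encoding=encoding) as handle:
        return parse_parallel_lines(handle, sep)


if __name__ == "__main__":
    # Placeholder lines standing in for real corpus content.
    sample = ["zh sentence 1\tug sentence 1\n", "\n", "zh sentence 2\tug sentence 2\n"]
    for chinese, uighur in parse_parallel_lines(sample):
        print(chinese, "|", uighur)
```

If the delivered files use a different layout (for example, separate files per language aligned by line number), only the parsing step would need to change; the resulting list of pairs can be fed to a tokenizer or training pipeline in the same way.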