Issues
How to compute only the perplexity of each paragraph using your language model with local data?
#54 opened by rongjingyue423 - 4
Running on local files
#22 opened by sashavor - 1
Extracting text from the WET format
#46 opened by wwfcnu - 0
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url
#49 opened by Hieunohair - 0
Can reproduce still run normally?
#51 opened by newbietuan - 2
Numerous Errors
#45 opened by conceptofmind - 0
win10 use cc_net
#50 opened by z-x-x136 - 12
CC-100 in statmt version is different from paper
#48 opened by nbqu - 0
Annotation statistics
#47 opened by mauriceweber - 0
The final json files are not as expected
#44 opened by nengyinyibeiwu - 0
EOFError: Compressed file ended before the end-of-stream marker was reached
#11 opened by zl827154659 - 0
when use odoo 16.0 in pycharm show this Error
#37 opened by mohamedGaber93 - 5
403 forbidden while downloading
#35 opened by Raven-Ren - 4
Error when Running 2020-34 dumps
#16 opened by Phil1108 - 1
Error: Mining phase failure
#36 opened by AssisRaphael - 1
Variance of hash files sizes in newer crawls
#27 opened by var926 - 5
make dl_all_lm failing
#20 opened by sashavor - 6
Question about the size of Roberta-small
#28 opened by MatthewCYM - 1
support of Hausa
#9 opened by donglixp - 0
Model finding
#21 opened by sashavor - 1
Are not all languages in the paper supported?
#18 opened by feddybear - 3
Cannot download the precomputed files
#7 opened by yinfeiy-g - 1
ERROR: Package u'cc-net' requires a different Python: 2.7.12 not in '>=3.7'
#12 opened by Nanamumuhan - 4
Failing to use mp execution
#4 opened by alexandremuzio - 3
ChunkedEncodingError & ConnectionResetError
#2 opened by soloice