NUS SMS Corpus

Due to technical problems, the NUS SMS Corpus website [http://wing.comp.nus.edu.sg/SMSCorpus](http://wing.comp.nus.edu.sg:8080/SMSCorpus) is temporarily unavailable. For your convenience, we have uploaded the most recent release (Mar 9, 2015) of the corpus here.

Please cite the following paper if you use our corpus. Thanks!

Tao Chen and Min-Yen Kan (2013). [Creating a Live, Public Short Message Service Corpus: The NUS SMS Corpus](http://link.springer.com/article/10.1007%2Fs10579-012-9197-9). Language Resources and Evaluation, 47(2), pages 299–355.

Language | File Format | Size | Number of Messages
------------ | ------------- | ------------- | -------------
English | SQL | [2,045K](smsCorpus_en_sql_2015.03.09_all.zip) | 55,835
English | XML | [2,359K](smsCorpus_en_xml_2015.03.09_all.zip) | 55,835
Chinese | SQL | [979K](smsCorpus_zh_sql_2015.03.09.zip) | 31,465
Chinese | XML | [1,182K](smsCorpus_zh_xml_2015.03.09.zip) | 31,465