Pinned Repositories
AlphaReadabilityCalculator
Alpha Readability Calculator is a wrapper around "readability" that calculates nine readability indices as well as 29 measures at the lexical and syntactic levels.
AlphaReadabilityChinese
AlphaReadabilityChinese is a tool that calculates the readability of Chinese texts using indices at the lexical, syntactic, and semantic levels.
leileibama.github.io
Lei Lei Homepage
LeoColloSharp
LeoColloSharp, a tool to search for collocates.
leoDDcalculator
leoDDcalculator, a package that calculates the MDD and NDD values of texts in a folder
leolemmatizer
leolemmatizer, a package for POS tagging and lemmatizing text files in a folder
leopythonbookdata
Data and code for Corpus Data Processing with Python
LinguisticFeatures
Calculation of linguistic features for quantitative and corpus linguistics studies
leileibama's Repositories
leileibama/AlphaReadabilityChinese
AlphaReadabilityChinese is a tool that calculates the readability of Chinese texts using indices at the lexical, syntactic, and semantic levels.
leileibama/leoDDcalculator
leoDDcalculator, a package that calculates the MDD and NDD values of texts in a folder
leileibama/LeoColloSharp
LeoColloSharp, a tool to search for collocates.
leileibama/AlphaReadabilityCalculator
Alpha Readability Calculator is a wrapper around "readability" that calculates nine readability indices as well as 29 measures at the lexical and syntactic levels.
leileibama/leileibama.github.io
Lei Lei Homepage
leileibama/leolemmatizer
leolemmatizer, a package for POS tagging and lemmatizing text files in a folder
leileibama/leopythonbookdata
Data and code for Corpus Data Processing with Python
leileibama/LinguisticFeatures
Calculation of linguistic features for quantitative and corpus linguistics studies
leileibama/4675-scifi
A Chinese NLP corpus of Chinese science fiction: a text corpus of about 4675 Chinese science fiction novels.
leileibama/COCA-WordFrequency
COCA, Top 5000 Word Frequency List
leileibama/core-books
Core classical texts of ancient China
leileibama/dataprep
Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.
leileibama/ELLIPSE-Corpus
The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus
leileibama/faststylometry
Stylometry library for Burrows' Delta method
leileibama/hyzd
Open Chinese Dictionary: a database of modern Chinese character pronunciations
leileibama/Jiayan
Jiayan (甲言), the first NLP toolkit designed for Classical Chinese, supports lexicon construction, tokenization, POS tagging, sentence segmentation, and punctuation.
leileibama/Metaphor_Generator
The first Chinese metaphor corpus for metaphor identification and generation. Presented at COLING 2022.
leileibama/MorphyNet
MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation)
leileibama/parser
State-of-the-art syntactic/semantic parsers, with pretrained models for more than 19 languages.
leileibama/persuade_corpus_2.0
Data associated with version 2.0 of the PERSUADE Corpus
leileibama/pycorrector
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error correction and works out of the box.
leileibama/pydelta
An experimental implementation of Burrows' Delta in Python 3
leileibama/pysenti
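Chinese Sentiment Classification Tool. Sentiment polarity classification based on the HowNet, Tsinghua, and BosonNLP sentiment lexicons; an easily extensible baseline method that works out of the box.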
leileibama/PySide6-Code-Tutorial
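Possibly the best Chinese-language PySide6 tutorial! Explains PySide6 through code examples, and shares quality demos, icon libraries, QSS skins, related articles, and more.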
leileibama/QuitaUp
QuitaUp: A tool for quantitative stylometric analysis
leileibama/RedditBias
Code & Data for the paper "RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models"
leileibama/rmrb
People's Daily (1946-2003)
leileibama/s2orc
S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/
leileibama/spacy-visualise-tree
Create dependency tree plots from SpaCy Doc objects
leileibama/tammi
Base code for the Tool for Automatic Measurement of Morphological Information (TAMMI)