Pinned Repositories
__SamplePlugin
This is the core template around which most of the other plugins are built. If you're looking to build your own plugin, start with this!
BUTTER_Client
The main client application — the good stuff!
ChineseTokenizer
Chinese tokenizer built around CoreNLP.NET
ContentCoding
Frequency-based content coding, like that found in LIWC.
LIWCdicToCSV
Plugin to convert a LIWC-formatted dictionary file into a nice, easily-readable spreadsheet.
NarrativeArc
Plugin to calculate Narrative Arc scores. See https://www.arcofnarrative.com/ and the research paper at https://www.doi.org/10.1126/sciadv.aba2196
ReadabilityMetrics
Plugin to calculate several different indices of "readability" (e.g., SMOG, Flesch-Kincaid).
SplitTextsIntoChunks
Plugin to segment texts in various ways.
WhitespaceTokenizer
Plugin for tokenizing texts via whitespace.
Word2Vec
Plugin to train a word2vec model.
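As a rough illustration of the ideas behind WhitespaceTokenizer and ContentCoding above, here is a minimal Python sketch of whitespace tokenization followed by frequency-based content coding. It is not taken from the plugins themselves. The function names, the lowercasing step, and the toy dictionary are all assumptions made for the example. Scores are expressed as the percentage of tokens that fall into each category, which is how LIWC-style output is usually reported.

```python
from collections import Counter


def whitespace_tokenize(text):
    """Split on runs of whitespace; lowercasing is an assumption of this sketch."""
    return text.lower().split()


def code_content(tokens, dictionary):
    """Return, for each category, the percentage of tokens found in that category."""
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    return {
        category: (100.0 * counts[category] / total if total else 0.0)
        for category in dictionary
    }


# Hypothetical two-category dictionary, for illustration only.
toy_dictionary = {
    "posemo": {"good", "happy"},
    "negemo": {"bad", "sad"},
}

tokens = whitespace_tokenize("Good days and bad days make a happy life")
scores = code_content(tokens, toy_dictionary)
print(scores)
```

A real dictionary would typically also need wildcard or stem matching, which this sketch leaves out.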
BUTTER-Tools Repositories
BUTTER-Tools/NarrativeArc
Plugin to calculate Narrative Arc scores. See https://www.arcofnarrative.com/ and the research paper at https://www.doi.org/10.1126/sciadv.aba2196
BUTTER-Tools/SplitTextsIntoChunks
Plugin to segment texts in various ways.
BUTTER-Tools/ContentCoding
Frequency-based content coding, like that found in LIWC.
BUTTER-Tools/CoreNLP.NET.POS.Tagger
Plugin that does Part Of Speech Tagging, built around CoreNLP.NET (https://sergey-tihon.github.io/Stanford.NLP.NET/StanfordCoreNLP.html)
BUTTER-Tools/ReadabilityMetrics
Plugin to calculate several different indices of "readability" (e.g., SMOG, Flesch-Kincaid).
BUTTER-Tools/RegExReplace
Plugin to perform user-defined regex replacements in texts.
BUTTER-Tools/TwitterAwareTokenizer
Plugin that is essentially a C# port of the NLTK Twitter-Aware Tokenizer (i.e., the "casual" tokenizer: https://github.com/nltk/nltk/blob/develop/nltk/tokenize/casual.py)
BUTTER-Tools/WhitespaceTokenizer
Plugin for tokenizing texts via whitespace.
BUTTER-Tools/Word2Vec
Plugin to train a word2vec model.
BUTTER-Tools/Blueberries
Software to update BUTTER plugins
BUTTER-Tools/CompareFrequencies
Statistically compare the word frequencies from 2 or more BUTTER frequency lists.
BUTTER-Tools/ConceptCategoryDiversity
Plugin that runs the analyses described in Vine, Boyd, & Pennebaker (2020). See also: Vocabulate (https://github.com/ryanboyd/Vocabulate)
BUTTER-Tools/CoreNLPSentiment
Plugin to do sentiment analysis on a sentence-by-sentence basis. Built around CoreNLP.NET (https://sergey-tihon.github.io/Stanford.NLP.NET/StanfordCoreNLP.html)
BUTTER-Tools/ExamineDictionaryWords
Plugin to evaluate the statistical properties of a text analysis dictionary. Reports the mean and standard deviation for each word, plus internal consistency metrics for each category.
BUTTER-Tools/InputFilesDOCX
Plugin to read in texts from .docx files.
BUTTER-Tools/LemmaGenLemmatizer
Plugin that wraps around the LemmaGen lemmatizer (http://lemmatise.ijs.si/Software/Version3)
BUTTER-Tools/LexicalDiversity
Plugin to calculate lexical diversity/richness scores, including measures such as type-token ratio.
BUTTER-Tools/LookupLemmatizer
Plugin to lemmatize based on pre-defined lists. At the time of writing this description, the lists used are primarily from https://github.com/michmech/lemmatization-lists
BUTTER-Tools/OmitObservation
Plugin to drop observations that fall below a user-specified number of tokens.
BUTTER-Tools/OutputFile_BuildCorpusTXT
Plugin to write a corpus of texts into a single .txt file (with each text separated by a newline).
BUTTER-Tools/OutputFileCSV
Plugin to write output to a CSV file. Used by most plugin chains to write your output.
BUTTER-Tools/OutputFilesTXT
Plugin to write strings out into separate .txt files.
BUTTER-Tools/Phrasifier
This plugin will use a BUTTER frequency list to replace individual words with phrases. This is useful for taking n-grams and joining them into single tokens using collocation metrics. A useful preprocessing step for something like word2vec.
BUTTER-Tools/PlugIndex
Indexer for the plugins. Not the smartest support tool, but gets the job done.
BUTTER-Tools/ReceptivitiAPI
Plugin to interface with the Receptiviti API (see https://www.receptiviti.com/ and https://receptiviti.github.io/api-docs/). Requires that you have your Receptiviti API keys available.
BUTTER-Tools/StopList
Plugin that contains stop lists and removes the listed tokens from a text before passing it forward.
BUTTER-Tools/Syrup
Hashing/indexing/etc. Just a part of the toolchain for package distribution.
BUTTER-Tools/TokenizedText2String
Plugin that takes tokens and concatenates them back into a single string. Useful for when you want to tokenize text, do a bunch of preprocessing, then output the preprocessed string.
BUTTER-Tools/VADER
Plugin that wraps around the VADER sentiment analysis method (specifically, the VADER Sharp implementation).
BUTTER-Tools/WeightedDictionary
Plugin for analyzing texts via a weighted dictionary.
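Several of the plugins listed above describe small token-level operations: OmitObservation, StopList, and the type-token ratio mentioned under LexicalDiversity. The Python sketch below shows one plausible reading of each. It is not the plugins' own code, and every function name, parameter, and sample value here is hypothetical.

```python
def omit_short_observations(observations, min_tokens):
    """Keep only observations with at least min_tokens tokens."""
    return [tokens for tokens in observations if len(tokens) >= min_tokens]


def remove_stop_words(tokens, stop_list):
    """Drop every token that appears in the stop list."""
    return [token for token in tokens if token not in stop_list]


def type_token_ratio(tokens):
    """Number of unique tokens divided by total tokens (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Two already-tokenized observations, for illustration only.
observations = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["hello"],
]

kept = omit_short_observations(observations, min_tokens=3)
filtered = remove_stop_words(kept[0], stop_list={"the", "on"})
ttr = type_token_ratio(kept[0])
print(kept, filtered, ttr)
```

Whether the threshold in OmitObservation is inclusive is an assumption in this sketch. The plugin description says only that observations falling below the user-specified number of tokens are dropped.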