
There are 64 repositories under the stylometry topic.

  • JasonKessler/scattertext

    Beautiful visualizations of how language differs among document types.

  • evllabs/JGAAP

    The Java Graphical Authorship Attribution Program

  • Jur1cek/gcj-dataset

    Collected solutions from the Google Code Jam programming competition (2008-2020).

  • dykang/PASTEL

    Data and code for the EMNLP 2019 paper by Kang et al., "(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Annotated Stylistic Language Dataset with Multiple Personas"

  • fastdatascience/faststylometry

    Stylometry library for Burrows' Delta method

    Language: Jupyter Notebook

  • goldmonkey21/doxer

    Stylometric Data Mining Library with a focus on identifying Satoshi Nakamoto as a case study.

  • SupervisedStylometry/SuperStyl

    Supervised Stylometry

  • Jur1cek/source2vec

    Source code embeddings for various programming languages

    Language: Jupyter Notebook

  • mullerpeter/authorstyle

    Python package to deal with PAN corpora and extract stylometric features from text documents.

  • rakshithShetty/A4NT-author-masking

    Repository for author masking

  • czcorpus/QuitaUp

    QuitaUp: A tool for quantitative stylometric analysis

  • a-coles/SMS-Stylometry

    A tool that predicts the dialect of English of an SMS message using recurrent neural networks supplemented with data from Google Trends.

  • christofs/stylometry-bibliography

    BibTeX copy of the Zotero bibliography on Stylometry

  • severinsimmler/shylo

    A Shiny GUI for Stylo

  • 7PartidasDigital/AnalisisTextual

    All ancillary materials surrounding the project on text analysis with R (Análisis de textos con R)

  • Michaeljfang/PyGAAP

    The Python Graphical Authorship Attribution Program — An experimental Python port of the Duquesne University Evaluating Variations in Language Lab's JGAAP.

  • payloadpl/stylometria

    Use of stylometry and machine learning in computer forensics: real tools used in 2019 by the Polish police. Everything is in and for the Polish language.

  • rafayetrafi/BanglaMusicStylo-A-Stylometric-Dataset-of-Bangla-Music-Lyrics

    With the rapid growth of the Bangla music industry, a huge volume of Bangla songs is produced every day. An immense number of producers, lyricists, singers and artists are involved in the production of songs from different genres. Among the many genres of Bangla music, classical, folk, baul, modern music, Rabindra Sangeet, Nazrul Geeti, film music, rock music and fusion music have gained the highest popularity. Lyricists try to express their feelings and views on a situation or subject through their writing. Therefore, each lyricist has their own dictionary of thoughts to put into music lyrics. In this paper, we present “BanglaMusicStylo”, the very first stylometric dataset of Bangla music lyrics. We have collected 2824 Bangla song lyrics by 211 lyricists in digital form. All the lyrics are stored in text format for further use. This dataset could be used for stylometric analysis such as authorship attribution, linguistic forensics, gender identification from textual data, Bangla music genre classification, vandalism detection, emotion classification, etc. Having identified the significant research opportunities in this area, we have formalized this dataset for use in stylometric analysis.

  • bbrause/subrosa

    subtitle-based film similarities

  • burgos2021/programa

    Materials for the summer course «Del corpus a la interpretación: Estilometría con R» (From Corpus to Interpretation: Stylometry with R), Burgos, 2021

  • Jero2760/estilometria

    Open corpus of Spanish-language works in txt format for stylometric studies

  • Stylometric-Analysis-on-British-Political-Speeches


    An exploratory research project focussing on extracting and analysing speeches from British political leaders, chief among them Winston Churchill.

    Language: Jupyter Notebook

  • top-on/llmask

    A command-line tool for masking authorship of text, by changing the writing style with a Large Language Model.

  • versotym/stichometry

    Stylometric analysis of poetic texts based on their versification

  • ABC-DH/EnExDi2020

    Materials for EnExDi2020 (Poitiers, February 10-14):

  • ancatmara/DH-Voronovo-Stylometry-2017

    Stylometry in R: Materials of the II Moscow-Tartu school in Digital Humanities

    Language: Jupyter Notebook

  • arojascastro/fabulasmitologicas

    A collection of Golden Age poems in Spanish in TEI and plain text

  • gmikros/Stylo-Tutorial

    A short introductory tutorial on the Stylo package for the R language

  • jeffasante/authorship-attribution

    Authorship attribution in tweeting.

  • jmclawson/stylo2gg

    Visualize and explore stylo data with ggplot2

  • Parallel-doc-embeds


    Comparison of classification power (literary authorship attribution case) of word-based, lemma-based, POS-based and mBERT-based document embeddings, as well as their combinations.

  • VictorIJnr/bu

    I like the name bu, but I called this User Stylometry Association, or UStylA, in my paper. In short, this just clusters users based on their stylometry - how they write stuff. This ended up as my Senior Honours project at The University of St Andrews. I had more ambitious plans but I didn't have enough time for them. This isn't half bad either though.

  • Darkar25/CSGAAP

    C# implementation of evllabs's JGAAP

  • ecomp-shONgit/stylo-ah-online

    Stylo ah online is an online tool for comparative text analysis in your browser. It implements a pipeline consisting of text (string) normalization, string decomposition (into tokens/features), counting and building a feature vector, measure computation (creating a distance matrix), and clustering.

  • isabel-mm/stylo-r-novels

    R+Python code for stylometric analysis on a corpus of Anglophone novels.

  • NKCZ/atds2022stylo

    Files for the Arbeitstagung der Skandinavistik 2022

    Language: Jupyter Notebook
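
The pipeline described in the ecomp-shONgit/stylo-ah-online entry (normalization, decomposition into tokens, counting into a feature vector, distance matrix, clustering) can be illustrated with a minimal Python sketch. This is not that project's code: the function names, the use of relative word frequencies, cosine distance, and threshold-based single-linkage clustering are all assumptions chosen for illustration.

```python
import math
import re
import unicodedata
from collections import Counter


def normalize(text):
    """Step 1: Unicode NFKC normalization plus lowercasing (assumed choice)."""
    return unicodedata.normalize("NFKC", text).lower()


def tokenize(text):
    """Step 2: decompose the string into word tokens (assumed feature type)."""
    return re.findall(r"\w+", text)


def feature_vectors(docs):
    """Step 3: relative word frequencies per document over a shared vocabulary."""
    counts = {name: Counter(tokenize(normalize(text))) for name, text in docs.items()}
    vocab = sorted(set().union(*counts.values()))
    vectors = {}
    for name, counter in counts.items():
        total = sum(counter.values()) or 1  # avoid division by zero for empty docs
        vectors[name] = [counter[word] / total for word in vocab]
    return vectors


def cosine_distance(u, v):
    """Cosine distance between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def distance_matrix(vectors):
    """Step 4: pairwise distances, keyed by (name_a, name_b)."""
    names = sorted(vectors)
    return {(a, b): cosine_distance(vectors[a], vectors[b]) for a in names for b in names}


def single_linkage(names, dist, threshold):
    """Step 5: merge the two closest clusters until the gap exceeds threshold."""
    clusters = [{name} for name in names]
    while len(clusters) > 1:
        d, i, j = min(
            (min(dist[a, b] for a in x for b in y), i, j)
            for i, x in enumerate(clusters)
            for j, y in enumerate(clusters)
            if i < j
        )
        if d > threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy corpus: two identical texts and one unrelated text.
docs = {
    "a": "The cat sat on the mat",
    "b": "the cat sat on the mat",
    "c": "Quantum flux capacitors hum loudly",
}
vectors = feature_vectors(docs)
dist = distance_matrix(vectors)
clusters = single_linkage(sorted(docs), dist, threshold=0.5)
print(clusters)
```

Real stylometric tools typically restrict the vocabulary to the most frequent words and offer several distance measures; the sketch keeps every word and a single measure to stay short.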