linguistic-analysis

There are 145 repositories under the linguistic-analysis topic.

  • DmitryRyumin/INTERSPEECH-2023-24-Papers

    INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

  • brucewlee/lingfeat

    [EMNLP 2021] LingFeat - A Comprehensive Linguistic Features Extraction ToolKit for Readability Assessment

    Language:Python1291916
  • jtanwk/nytcrossword

    An exploration of New York Times crossword answers from 1994-2017, i.e. the Will Shortz era.

    Language:HTML122418
  • LSYS/LexicalRichness

    😸 💬 A module to compute textual lexical richness (aka lexical diversity).

    Language:Python11033022
  • THU-KEG/ChatLog

    ⏳ ChatLog: Recording and Analysing ChatGPT Across Time

    Language:Jupyter Notebook102633
  • sillsdev/FieldWorks

    FieldWorks is a suite of software tools for language and cultural data, with support for complex scripts.

    Language:C#9811038
  • Halvani/Constituent-Treelib

    A lightweight Python library for constructing, processing, and visualizing constituent trees.

    Language:Jupyter Notebook672512
  • nickduran/align-linguistic-alignment

    Python library for extracting quantitative, reproducible metrics of multi-level alignment between speakers in naturalistic language corpora.

    Language:Python5232516
  • STRZGR/Natural-Language-Processing-with-Python-Analyzing-Text-with-the-Natural-Language-Toolkit

    My solutions to selected exercises from "Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit" by Steven Bird, Ewan Klein, and Edward Loper.

    Language:Jupyter Notebook513036
  • jcvasquezc/phonet

    Keras-based Python framework to compute phonological posterior probabilities from audio files

    Language:Python442317
  • livingtongues/living-dictionaries

    Speeding the availability of language resources for endangered languages. Tools such as this have the power to shift how we think about endangered languages. Rather than perceiving them as being antiquated, difficult to learn and on the brink of vanishing, we see them as modern, easily accessible for learning online in text and audio formats.

    Language:TypeScript3721552
  • fidelisrafael/esperanto-analyzer

    Morphological and syntactic analysis of Esperanto sentences

    Language:Python33401
  • NEU-DSG/dailp-encoding

    Digital Archive of American Indian Languages Preservation and Perseverance

    Language:TypeScript2312614
  • hoangsonww/Amazon-Reviews-Analysis

    🧐 This project analyzes Amazon Fine Food Reviews to investigate whether negative reviews are more emotionally intense and lexically repetitive than positive ones. Using R, we apply sentiment analysis and lexical diversity metrics to uncover patterns in consumer review language.

    Language:R22
  • hoangsonww/Malawian-CiTonga-Tone-Production

    🇲🇼 A project analyzing how onset consonant type affects tone realization in Malawian CiTonga verb stems, using pitch (F₀) data from phonetic fieldwork. Includes two experiments comparing mean F₀ across tonal and consonantal contexts, with statistically significant findings and clear visualizations.

    Language:R2112012
  • hoangsonww/Pokemon-Name-Physique-Analysis

    🐱 A project exploring relationships between Pokémon names and physical traits using R, with string-based pattern detection, group comparisons based on consonant “heaviness,” and regression models predicting weight from height and Attack. Includes hypothesis-driven name analyses and statistical summaries for both English and Japanese name sets.

    Language:R21
  • hoangsonww/Brazilian-Portuguese-Nonce-Accessbility

    🇧🇷 A project for analyzing acceptability judgments of Brazilian Portuguese nonce words using R, focusing on syllable length and initial segment type. Includes mosaic plots and chi-square tests to assess structural effects on responses, with results suggesting no significant influence from either factor.

    Language:R20
  • korpling/graphANNIS

    This is a new backend implementation of the ANNIS linguistic search and visualization system.

    Language:Rust186401
  • n3a9/vera

    Winner of the LA Hacks award for Best Use of Wolfram Tech 🎉 An AI system to determine if a given statement is true or false.

    Language:JavaScript18432
  • katreparitosh/Discourse-Analytics-of-Political-Speech-Transcripts

    Political Discourse Analysis (PDA) of Political Speech Transcripts using Natural Language Processing (NLP)

    Language:Jupyter Notebook16001
  • i-amritpal/Feature-based-fake-review-detection

    This project, part of my B.Tech final-year project, investigates the influence of linguistic and sentiment analysis features on detecting fake reviews in e-commerce (Amazon).

    Language:Jupyter Notebook13100
  • TALP-UPC/saga

    SAGA - Phonetic transcription software for all Spanish variants.

    Language:C12326
  • matthias-stemmler/annimate

    Your Friendly ANNIS Match Exporter

    Language:TypeScript111180
  • public-law/readability

    How readable is your text? Provide a text input and get its grade level. Validated against the source data.

    Language:Python11201
  • fidelisrafael/esperanto-analyzer-react

    Front-end application for 'Esperanto Grammar Analyzer' built with React.js.

    Language:JavaScript8251
  • jklu-jaipur/Political-Biasness-Detection

    Our ML model estimates the bias of a political article based on linguistic features and classifies it as biased towards the ruling government, biased towards the opposition, or neutral.

    Language:Jupyter Notebook8023
  • audreycs/ImpScore

    Repository for the paper "ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Sentences", accepted to ICLR 2025.

    Language:Python7130
  • Itabashi-don/Shiina

    I'm Shii-chan, a high school girl living in Itabashi! ( ˙꒳˙ )

    Language:JavaScript7200
  • unrealtecellp/life

    Linguistic Field Data Management and Analysis System [LiFE]

    Language:Python7301
  • arjo129/LangCluster

    A visualization of cognates across various languages and how they spread

    Language:Python6202
  • devSuchit/nlp-cky-PCFG

    This repository contains an implementation of CKY parsing for English. (NLP)

    Language:Python6000
  • GiellaLT-Archive/giella-shared

    Shared linguistic resources, like names, digits, fst filtering and dependency parsing.

    Language:Rich Text Format62220
  • bhalla98/LinguisticTagger

    Segments natural language text and tags it with different parts of speech.

    Language:Swift5100
  • Abe-Alefew/LexiLink

    The aim of this mini-project is to analyze the textual and phonemic similarities between the Afan Oromo and Somali languages by examining word frequency, overlap, and phonemic distribution.

    Language:Python4
  • jjordanoc/robust-english-speech-fluency-classification

    Fluency level classifier of L2 English speech

    Language:Jupyter Notebook4100
  • mmmaurer/elfen

    A Python package to efficiently extract linguistic features for text/NLP datasets

    Language:Python4113