language-processing

There are 255 repositories under the language-processing topic.

  • MarginaliaSearch/MarginaliaSearch

    Internet search engine for text-oriented websites. Indexing the small, old and weird web.

    Language: HTML
  • pemistahl/lingua-go

    The most accurate natural language detection library for Go, suitable for short text and mixed-language text

    Language: Go
  • pemistahl/lingua-rs

    The most accurate natural language detection library for Rust, suitable for short text and mixed-language text

    Language: Rust
  • pemistahl/lingua

    The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

    Language: Kotlin
  • 10up/classifai

    Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence.

    Language: PHP
  • NirantK/NLP_Quickbook

    NLP in Python with Deep Learning

    Language: Jupyter Notebook
  • knadh/dictpress

    A stand-alone web server application for building and publishing full-fledged dictionary websites and APIs for any language.

    Language: Go
  • MycroftAI/padatious

    A neural network intent parser

    Language: Python
  • gemengtju/Tutorial_Speech_Signal_Processing

    This repo summarizes the courses and materials for speech signal processing. You are kindly invited to submit pull requests.

  • sefineh-ai/Amharic-Tokenizer

    Syllable-aware BPE tokenizer for the Amharic language (አማርኛ) – fast, accurate, trainable.

    Language: Cython
  • WZBSocialScienceCenter/germalemma

    A lemmatizer for German language text

    Language: Python
  • AdyTech99/volo

    An F/OSS solution combining AI with Wikipedia knowledge via a RAG pipeline

    Language: Python
  • ysenarath/sinling

    A collection of NLP tools for Sinhalese (සිංහල).

    Language: Jupyter Notebook
  • srix/pytamil

    The பைந்தமிழ் (pytamil) library is intended for the analysis of Tamil literary works. A wealth of knowledge is hidden in old literature; these works are time machines to the past. Ever wondered what the popular color or food was in the Tamil-speaking world in 500 AD? The answer is hidden in literature. With the right computer tools, it becomes possible for us to dig into this wealth of knowledge.

    Language: Python
  • mako443/Text2Pos-CVPR2022

    Code, dataset and models for our CVPR 2022 publication "Text2Pos"

    Language: Python
  • M4t1ss/parallel-corpora-tools

    Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.

    Language: PHP
  • TimKam/schreib-gut

    German extension for write-good

    Language: JavaScript
  • imsanjoykb/German-Language-Learning-Resource

    German Language Learning Resource

  • versotym/rhymetagger

    A simple collocation-driven recognition of rhymes. Contains pre-trained models for Czech, Dutch, English, French, German, Russian, and Spanish poetry

    Language: Python
  • ActiveNick/Unity-SpeechWithLUIS

    Sample Unity project used to demonstrate the integration of Speech Recognition and Language Understanding using the new Microsoft Speech Service (Preview) and LUIS from Microsoft Cognitive Services.

    Language: C#
  • mapado/pynlg

    ``pynlg`` is a pure Python re-implementation of [SimpleNLG-EnFr](https://github.com/rali-udem/SimpleNLG-EnFr), a Java library enabling bilingual [text surface realisation](https://en.wikipedia.org/wiki/Realization_%28linguistics%29), based on [SimpleNLG](https://github.com/simplenlg/simplenlg).

    Language: Python
  • pigoz/lat

    Tools to automate language acquisition through immersion. Includes sentence analysis (from books, subtitles) and Anki cards creation.

    Language: Ruby
  • triatebr/aprenda-python

    Learning notes, tips, and projects about Python

    Language: Jupyter Notebook
  • Near32/ReferentialGym

    This framework provides out-of-the-box implementations of Referential Games variants in order to study the emergence of artificial languages using deep learning, relying on PyTorch (https://www.pytorch.org).

    Language: Python
  • RMNCLDYO/groq-ai-toolkit

    A lightweight Python API wrapper and CLI for Groq’s offering of language models using their ultra fast LPU Inference Engine.

    Language: Python
  • searchpioneer/lingua-dotnet

    Natural language detection library for .NET, suitable for long and short text alike

    Language: C#
  • mujeebishaque/language-detector

    This software detects the language of a website. It goes over a list of provided URLs and saves each URL and its language in an Excel sheet.

    Language: Python
  • martinferianc/C90Compiler-EIE2

    C90 to MIPS I compiler done as coursework for EE2-15

    Language: C++
  • ishto7/persianutils

    Standardize your Persian text: Preprocessing, Embedding, and more!

    Language: Python
  • FORMAS/DptOIE

    Language: Java
  • vignif/lex-yacc-SQL-parser

    A simple parser for the standard SQL language, developed using Lex and Yacc. Project made for Language Processing Technologies @diism University of Siena. Feel free to use it for academic purposes.

    Language: C
  • verifid/ner-d

    Python module for Named Entity Recognition (NER) using natural language processing.

    Language: Python
  • lexected/astir

    A flexible parser generator producing output from object-oriented hierarchical context-free grammar specifications.

    Language: C++
  • melchisedech333/antlr4-experiments

    My studies on context-free grammar, using ANTLR4 (C++) to generate the parser files. Some basics are developed, such as token processing, recursion, variable definition, array processing, Abstract Syntax Tree (AST) manipulation, Unicode support, and error handling.

    Language: Java
  • RacimRgh/Dictionnaire-medical-Python-Unitex

    A Python scraper that generates a medical dictionary from vidal.fr, then enhances it using Unitex/Gramlab

    Language: Python
  • shamspias/google-meet-translator-extension

    Google Meet Transcript Translator is a Chrome extension that translates live transcriptions during a Google Meet call into your chosen language. Enhance your global communication.

    Language: JavaScript