bigrams

There are 98 repositories under the bigrams topic.

  • ollie283/language-models

    Build unigram and bigram language models, implement Laplace smoothing and use the models to compute the perplexity of test corpora.

    Language:Python841141
  • starlordvk/Typing-Assistant

    Typing Assistant autocompletes words and suggests predictions for the next word. This makes typing faster and more intelligent, and reduces effort.

    Language:CSS546113
  • susantabiswas/Word-Prediction-Ngram

    Next Word Prediction using n-gram Probabilistic Model with various Smoothing Techniques

    Language:Jupyter Notebook364311
  • dohliam/hawaiian-corpus

    Data from a corpus of written Hawaiian

  • KhaledAshrafH/Auto-Filling-Text

    This project is an auto-filling text program implemented in Python using N-gram models. The program suggests the next word based on the input given by the user. It utilizes N-gram models, specifically Trigrams and Bigrams, to generate predictions.

    Language:Python16414
  • rohitthapliyal2000/Sentiment-Analysis-NLTK

    Opinion mining on data from various NLTK corpora to test and enhance the accuracy of the NaiveBayesClassifier model.

    Language:Python14104
  • prigarg/Bigram-Language-Model-from-Scratch

    A Bigram Language Model from scratch with no-smoothing and add-one smoothing. Outputs bigram counts, bigram probabilities and probability of test sentence.

    Language:Jupyter Notebook11101
  • DigitalTools/nltk-book

    Jupyter Notebook for Natural Language Processing learning

    Language:Jupyter Notebook10204
  • sachin-bisht/YouTube-Sentiment-Analysis

    (UNMAINTAINED) Fetch comments from the given video and determine whether the sentiment towards the video is positive or negative

    Language:Python9122
  • iAmKankan/Natural-Language-Processing-NLP-Tutorial

    NLP tutorials and guidelines to learn efficiently

  • senya-ashukha/bigram-anchor-words

    An Implementation of Bigram Anchor Words algorithm

    Language:Python8305
  • mochi-co/ngrams

    A Go n-gram indexer for natural language processing with modular tokenizers and data stores

    Language:Go7100
  • sohailahmedkhan/Sentence-Completion-using-Hidden-Markov-Models

    The goal of this script is to implement three language models to perform sentence completion, i.e. given a sentence with a missing word, to choose the correct one from a list of candidate words. The way to use a language model for this problem is to consider one candidate word for the sentence at a time and then ask the language model which version of the sentence is the most probable.

    Language:Python7000
  • giocoal/word-embedding-italian-literature

    Using distributional semantics (word2vec family algorithms and the CADE framework) to learn word embeddings from the Italian literary corpora we generated.

    Language:Python6102
  • burhanharoon/N-Gram-Language-Model

    A Python-based n-gram language model that calculates bigrams, the probability and Laplace-smoothed probability of a sentence using bigrams, and the perplexity of the model.

    Language:Python5412
  • luizanisio/Doc2VecFacil

    A class that simplifies the process of creating a Doc2Vec (gensim) model, with helpers for generating a custom vocabulary and for generating curation files. Tips on using elasticsearch and singlestore.

    Language:Python5101
  • gromag/Data-Science-Specialisation-Predict-Next-Word

    Predicting the next word with Natural Language Processing. Being able to predict what word comes next in a sentence is crucial when writing on portable devices that don't have a full-size keyboard. However, the same techniques used in texting applications can be applied to a variety of other applications, for example genomics (by segmenting DNA sequences), speech recognition, automatic language translation or even, as one student in the course suggested, music sequence prediction.

    Language:HTML3300
  • motiurinfo/sentiment_classification

    Performance evaluation of sentiment classification on movie reviews

    Language:Python3200
  • ricardobreis/Text-Mining-Acesso-Info-SP

    A text mining analysis of access-to-information requests made to the São Paulo municipality in 2018

    Language:R3100
  • sachin-bisht/Sentiment-Analysis-NLTK

    Sentiment Analysis / Opinion Mining for provided data in NLTK corpus using NaiveBayesClassifier Algorithm

    Language:Python3103
  • VaasuDevanS/Natural-Language-Processing-Assignments

    UNB Fall-2018 NLP Assignments 💬

    Language:Python3001
  • Adrianogba/bigrama-trigrama-python

    This is a simple artificial intelligence program that predicts the next word based on a given string, using bigrams and trigrams built from a .txt file. There are two versions of the code, one using the console and the other using tkinter.

    Language:Python2100
  • AslanDevbrat/Computational-Linguistic

    Assignments of CL

    Language:Jupyter Notebook2101
  • DorinK/Deep-Learning-Gradient-based-Learning

    First assignment in the 'Deep Learning for Texts and Sequences' course (using NumPy only) by Prof. Yoav Goldberg at Bar-Ilan University

    Language:Python2101
  • Psmths/bigram-file-analysis

    Proof of concept that leverages machine learning to classify files based on their bigram frequency distributions.

    Language:Jupyter Notebook2100
  • ZNClub-PA-ML-AI/NLP-techniques

    Testing & learning different nlp and lex techniques

    Language:Jupyter Notebook2300
  • akshataupadhye/News-articles-clustering-A-comparative-approach

    A project featuring the use of various NLP techniques and ML algorithms, such as topic modelling and paragraph embeddings, for document clustering. 📰📚

    Language:Jupyter Notebook1100
  • daverlon/ngram-wordgen

    Word generator using n-gram probabilities

    Language:Jupyter Notebook1100
  • fikrirazor/bigramindo

    Bigrams in Python, using Indonesian-language sentences

    Language:Jupyter Notebook1110
  • gjorm/WordSeg

    Word Segmentation on strings with no spaces.

    Language:C++1002
  • mhasegawa7045/Film_Movie_Text_Mining_Sentimental_Analysis_Machine_Learning

    [Tokenization, Topic Modeling, Sentiment Analysis, Network of Bigrams] The purpose of this project is to see whether text mining techniques can support better analysis for categorizing movies using only the Descriptions, ignoring the Genre, in the dataset IMDB_movies.csv, which is stored in the data frame variable movies_desc. Tokenization (TF-DF) was used to analyze term frequencies in movie Descriptions more efficiently, so that the conceptual theme of a movie franchise can be determined even by someone who has never watched any of the films. Topic Modeling creates the mixtures of terms that are correlated with each topic and the mixtures of topics that distinguish each document in the dataset. Sentiment Analysis focused on movies in sentiment clusters built with the bing and NRC lexicons, to see how sentiment affects Rating and Revenue. The network of bigrams for the movies dataset helps summarize how frequent word-terms in movie Descriptions form term relationships and how they connect to other movies.

    Language:HTML1300
  • octokami/news_stock_market

    Predict stock price movements based on news articles. We used the BoW approach and sentiment analysis of titles of news articles.

    Language:Jupyter Notebook1140
  • pngo1997/N-gram-Language-Models

    Builds N-gram language models and applies text generation.

    Language:Jupyter Notebook1100
  • sashakenjeeva/spell-corrector

    A context-sensitive, one-edit distance spelling corrector

    Language:Jupyter Notebook1100
  • zoobereq/Richness-of-the-Stimulus

    A replication of an experiment by Reali and Christiansen (2005) disputing the basic assumptions of Chomsky's Poverty of Stimulus theory.

    Language:Python1100
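
Several of the descriptions above mention the same basic recipe: count bigrams, apply add-one (Laplace) smoothing, and compute the perplexity of test sentences. The sketch below illustrates that recipe only in general terms; it is not taken from any listed repository, and the function names (`train_bigram_model`, `laplace_prob`, `perplexity`), the `<s>`/`</s>` padding tokens and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over tokenized sentences.

    Each sentence is padded with hypothetical <s> and </s> boundary tokens.
    """
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def laplace_prob(prev, word, unigrams, bigrams):
    """Estimate P(word | prev) with add-one smoothing.

    The vocabulary size is taken from the training unigrams (an assumption;
    other conventions for the vocabulary are possible).
    """
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def perplexity(sentences, unigrams, bigrams):
    """Perplexity of tokenized sentences under the smoothed bigram model."""
    log_prob, n = 0.0, 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            log_prob += math.log(laplace_prob(prev, word, unigrams, bigrams))
            n += 1
    return math.exp(-log_prob / n)


# Toy training corpus (made up for this example).
train = [["i", "like", "tea"], ["i", "like", "coffee"]]
unigrams, bigrams = train_bigram_model(train)

print(laplace_prob("i", "like", unigrams, bigrams))
print(perplexity([["i", "like", "tea"]], unigrams, bigrams))
```

With smoothing, a bigram that never occurred in training still gets a small non-zero probability, so the perplexity of an unseen test sentence stays finite.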