topic-modeling

There are 1890 repositories under the topic-modeling topic.

  • ddbourgin/numpy-ml

    Machine learning, in numpy

    Language:Python15.8k461503.8k
  • gensim

    piskvorky/gensim

    Topic Modelling for Humans

    Language:Python15.7k4301.8k4.4k
  • BERTopic

    MaartenGr/BERTopic

    Leveraging BERT and c-TF-IDF to create easily interpretable topics.

    Language:Python6.3k521.7k773
  • ddangelov/Top2Vec

    Top2Vec learns jointly embedded topic, document and word vectors.

    Language:Python3k37332374
  • baidu/Familia

    A Toolkit for Industrial Topic Modeling

    Language:C++2.6k157103594
  • JasonKessler/scattertext

    Beautiful visualizations of how language differs among document types.

    Language:Python2.3k55101293
  • ContextLab/hypertools

    A Python toolbox for gaining geometric insights into high-dimensional data

    Language:Python1.8k60197161
  • nomic-ai/nomic

    Interact, analyze and structure massive text, image, embedding, audio and video datasets

    Language:Python1.4k2765175
  • owlbarn/owl

    Owl - OCaml Scientific Computing @ https://ocaml.xyz

    Language:OCaml1.2k45300124
  • MilaNLProc/contextualized-topic-models

    A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

    Language:Python1.2k17109147
  • dselivanov/text2vec

    Fast vectorization, topic modeling, distances and GloVe word embeddings in R.

    Language:R85354307134
  • OCTIS

    MIND-Lab/OCTIS

    OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at the EACL 2021 demo track)

    Language:Python73915103106
  • bigartm/bigartm

    Fast topic modeling platform

    Language:C++66141364120
  • gregversteeg/corex_topic

    Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx

    Language:Python6272941120
  • stepthom/text_mining_resources

    Resources for learning about Text Mining and Natural Language Processing

  • tomotopy

    bab2min/tomotopy

    Python package of Tomoto, the Topic Modeling Tool

    Language:C++5651216863
  • cpsievert/LDAvis

    R package for web-based interactive topic model visualization.

    Language:JavaScript5573279131
  • dongrixinyu/chinese_keyphrase_extractor

    An off-the-shelf tool for Chinese keyphrase extraction. It quickly extracts keyphrases from Chinese text and uses only 35 MB of memory. www.jionlp.com

    Language:Python54321368
  • vi3k6i5/GuidedLDA

    Semi-supervised guided topic model with custom GuidedLDA

    Language:Python5011360110
  • stephenhky/PyShortTextCategorization

    Various Algorithms for Short Text Mining

    Language:Python466215172
  • jmartinezheras/2018-MachineLearning-Lectures-ESA

    Machine Learning Lectures at the European Space Agency (ESA) in 2018

    Language:Jupyter Notebook354311145
  • ruidan/Unsupervised-Aspect-Extraction

    Code for the ACL 2017 paper "An unsupervised neural attention model for aspect extraction"

    Language:Python3381330117
  • primaryobjects/lda

    LDA topic modeling for node.js

    Language:JavaScript29271148
  • chtmp223/topicGPT

    TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL'24)

    Language:Python23951037
  • yangliuy/LDAGibbsSampling

    Open Source Package for Gibbs Sampling of LDA

    Language:Java233352214
  • cohere-ai/sandbox-topically

    Topic modeling helpers using managed language models from Cohere. Name text clusters using large GPT models.

    Language:Jupyter Notebook21811318
  • dice-group/Palmetto

    Palmetto is a quality measuring tool for topics

    Language:Java216197536
  • BobXWu/TopMost

    A Topic Modeling System Toolkit

    Language:Jupyter Notebook2022816
  • Concept

    MaartenGr/Concept

    Concept Modeling: Topic Modeling on Images and Text

    Language:Python19851916
  • WZBSocialScienceCenter/tmtoolkit

    Text Mining and Topic Modeling Toolkit for Python with parallel processing power

    Language:Python192161927
  • datquocnguyen/LFTM

    Improving topic models LDA and DMM (one-topic-per-document model for short texts) with word embeddings (TACL 2015)

    Language:Java178131359
  • maxent-ai/converse

    Conversational text analysis using various NLP techniques

    Language:Jupyter Notebook1789419
  • binoydutt/Resume-Job-Description-Matching

    The purpose of this project was to defeat the Applicant Tracking Systems that most organizations use to filter out resumes. To achieve this, I had to come up with a universal score that helps the applicant understand how well a resume currently matches. The project involved the following steps:
    1) Collecting job descriptions from the Glassdoor website using Selenium, as other scrapers failed.
    2) Parsing PDF resumes using PDFMiner.
    3) Creating a vector representation of each job description: word2vec was used to embed words in a 300-dimensional vector space, with each document represented as a list of word vectors.
    4) Weighting each word to offset terms specific to individual job descriptions, using TF-IDF scores as the word weights.
    5) Giving important skill-related words higher weights, then computing the overall mean vector of each job description from the products of its word vectors and their TF-IDF scores.
    6) Using cosine similarity to measure how similar the job description and the resume are.
    7) Identifying various natural language processing techniques to suggest improvements to the resume that could increase the match score.

    Language:Python16610384
  • charlesdedampierre/BunkaTopics

    🗺️ Data Cleaning and Textual Data Visualization 🗺️

    Language:Python1563514
  • osainz59/Ask2Transformers

    A Framework for Textual Entailment based Zero Shot text classification

    Language:Python15561215
  • yuewang-cuhk/TAKG

    The official implementation of ACL 2019 paper "Topic-Aware Neural Keyphrase Generation for Social Media Language"

    Language:Python15441433
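
The binoydutt/Resume-Job-Description-Matching entry above describes scoring a resume against job descriptions by averaging word vectors with TF-IDF weights and comparing the results with cosine similarity. The following is a minimal sketch of that idea, not the project's actual code. The 3-dimensional toy vectors stand in for the 300-dimensional word2vec embeddings, the smoothed IDF formula is an assumption, and the names `tfidf_weights`, `weighted_mean_vector`, and `cosine_similarity` are hypothetical helpers introduced for illustration.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        term_freq = Counter(doc)
        weights.append({
            # Smoothed IDF (an assumption of this sketch).
            word: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
            for word, count in term_freq.items()
        })
    return weights


def weighted_mean_vector(doc_weights, vectors, dim):
    """Average the word vectors of one document, weighted by TF-IDF."""
    total = [0.0] * dim
    weight_sum = 0.0
    for word, weight in doc_weights.items():
        vec = vectors.get(word)
        if vec is None:
            continue  # skip words that have no embedding
        for i in range(dim):
            total[i] += weight * vec[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy embeddings (hypothetical values, not trained word2vec output).
toy_vectors = {
    "python": [1.0, 0.0, 0.0],
    "sql": [0.9, 0.1, 0.0],
    "cooking": [0.0, 0.0, 1.0],
    "chef": [0.0, 0.1, 0.9],
}

job_a = ["python", "sql", "python"]
job_b = ["chef", "cooking"]
resume = ["python", "sql"]

w_job_a, w_job_b, w_resume = tfidf_weights([job_a, job_b, resume])
vec_job_a = weighted_mean_vector(w_job_a, toy_vectors, 3)
vec_job_b = weighted_mean_vector(w_job_b, toy_vectors, 3)
vec_resume = weighted_mean_vector(w_resume, toy_vectors, 3)

sim_a = cosine_similarity(vec_resume, vec_job_a)
sim_b = cosine_similarity(vec_resume, vec_job_b)
print(f"resume vs job_a: {sim_a:.3f}")
print(f"resume vs job_b: {sim_b:.3f}")
```

With these toy inputs the resume scores far higher against `job_a` than against `job_b`, since its words share embedding directions with the first job description. A real pipeline would load trained embeddings and compute the TF-IDF statistics over the full corpus of job descriptions.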