AVPittman's Stars
tesseract-ocr/tesseract
Tesseract Open Source OCR Engine (main repository)
piskvorky/gensim
Topic Modelling for Humans
codelucas/newspaper
newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
flairNLP/flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
cjhutto/vaderSentiment
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
ckan/ckan
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
sanchit-gandhi/whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
RomelTorres/alpha_vantage
A Python wrapper for the Alpha Vantage API for financial data.
IntersectMBO/cardano-node
The core component that is used to participate in a Cardano decentralised blockchain.
fhamborg/news-please
news-please - an integrated web crawler and information extractor for news that just works
explosion/sense2vec
🦆 Contextually-keyed word vectors
sethblack/python-seo-analyzer
An SEO tool that analyzes the structure of a site, crawls the site, counts words in the body of the site and warns of any technical SEO issues.
odpi/OpenDS4All
OpenDS4All project, hosted by LF AI & Data
alpacahq/alpaca-backtrader-api
Alpaca Trading API integrated with backtrader
vivekn/sentiment
Sentiment analysis using machine learning techniques.
cltl/python-for-text-analysis
If you want to use Python for text analysis, this course is for you!
Stage-Whisper/Stage-Whisper
The main repo for Stage Whisper — a free, secure, and easy-to-use transcription app for journalists, powered by OpenAI's Whisper automatic speech recognition (ASR) machine learning models.
kotartemiy/extract-news-api
Flask code to deploy an API that pulls structured data from online news articles
fhamborg/NewsMTSC
Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k sentences and a state-of-the-art classification model.
MS20190155/Measuring-Corporate-Culture-Using-Machine-Learning
Code Repository for MS20190155
jananiarunachalam/Research-Paper-Summarization
Text Summarization for Research Papers
SahilChoudhary22/pdf2csv-converter
A PDF-to-CSV converter written in Python
ArcasProject/Arcas
A tool designed to help scrape APIs for academic articles
artano-io/artano
A Cardano NFT Marketplace
osdg-ai/osdg-data
The OSDG Community Dataset (OSDG-CD) is a public dataset of thousands of text excerpts, validated by OSDG Community Platform (OSDG-CP) citizen scientists with respect to the Sustainable Development Goals (SDGs). The dataset is updated every quarter and published on Zenodo.
IBM/Semantic-Search-for-Sustainable-Development
Semantic Search for Sustainable Development is experimental code for searching documents for text that "semantically" corresponds to any of the UN's Sustainable development goals/targets. For example, it can be used to mine the national development plan documents of a country and identify pieces of text that correspond to any of the SDGs in order to verify alignment of the plan with the SDGs.
UNStats/LOD4Stats
IFRCGo/DREF-NLP
DREF report analysis using Natural Language Processing, project with Amesto NextBridge
NervousBlakedown/NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
safer-ai/Countergen