Pinned Repositories
19th_C._novel_scraper
Scrapes epigraphs and metadata from XML/TEI-encoded texts
allennlp
An open-source NLP research library, built on PyTorch.
arxiv-public-datasets
A set of scripts to grab public datasets from resources related to arXiv
citation.js
Citation.js converts formats like BibTeX, Wikidata JSON and ContentMine JSON to CSL-JSON to convert to other formats like APA, Vancouver and back to BibTeX.
constellate-notebooks
Example notebooks and tutorials from Constellate, the text analysis service from ITHAKA.
DETM
DH2020
The Cartography of DH2020 is based on an innovative visual method to explore conference speakers. Now that conferences have moved online, we need to reinvent the way we explore public events.
doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
dynamic_bernoulli_embeddings
OCR_PDFs
A Python 3/Jupyter script using OCRmyPDF and Tesseract to batch-process all PDFs in a directory and all its subdirectories
aaronplasek's Repositories
aaronplasek/OCR_PDFs
A Python 3/Jupyter script using OCRmyPDF and Tesseract to batch-process all PDFs in a directory and all its subdirectories
aaronplasek/19th_C._novel_scraper
Scrapes epigraphs and metadata from XML/TEI-encoded texts
aaronplasek/allennlp
An open-source NLP research library, built on PyTorch.
aaronplasek/arxiv-public-datasets
A set of scripts to grab public datasets from resources related to arXiv
aaronplasek/citation.js
Citation.js converts formats like BibTeX, Wikidata JSON and ContentMine JSON to CSL-JSON to convert to other formats like APA, Vancouver and back to BibTeX.
aaronplasek/constellate-notebooks
Example notebooks and tutorials from Constellate, the text analysis service from ITHAKA.
aaronplasek/DETM
aaronplasek/DH2020
The Cartography of DH2020 is based on an innovative visual method to explore conference speakers. Now that conferences have moved online, we need to reinvent the way we explore public events.
aaronplasek/doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
aaronplasek/dynamic_bernoulli_embeddings
aaronplasek/Epidemiology101
Epidemic Modeling for Everyone
aaronplasek/kraken
OCR engine for all the languages
aaronplasek/github-project-management-example
Example of how to use GitHub for project management.
aaronplasek/lime
Lime: Explaining the predictions of any machine learning classifier
aaronplasek/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
aaronplasek/Origins_and_Meaning_Reading_List
An expanded and annotated reading list for the Fall 2021 "Origins & Meaning" course
aaronplasek/python-intro-course