/Domain-specific-data-collection-from-structured-and-unstructured-sources

Data collection (scraping and dynamic crawling) for the domain "Computer Scientists" from 13 websites, including Wikipedia, Google Scholar, and DBLP, with the results merged into a high-quality tabular dataset.
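
The merging step might look roughly like the sketch below. It is not taken from the repository's notebooks: the function names `normalize_name` and `merge_sources`, the field names, and the sample rows are all hypothetical, and matching people on a normalized name alone is an assumption (distinct people who share a name would be merged by mistake).

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Build a merge key: drop accents, punctuation, case, and extra spaces."""
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    letters_only = re.sub(r"[^a-z ]", "", ascii_name.lower())
    return " ".join(letters_only.split())


def merge_sources(*sources):
    """Merge lists of record dicts into one row per person.

    Sources are processed in order, and the first non-empty value
    seen for a field is kept (an assumed conflict rule).
    """
    merged = {}
    for records in sources:
        for record in records:
            key = normalize_name(record["name"])
            row = merged.setdefault(key, {})
            for field, value in record.items():
                if value is not None and field not in row:
                    row[field] = value
    return list(merged.values())


# Made-up placeholder rows, not real scraped data.
wikipedia_rows = [{"name": "Ada Example", "born": 1950}]
dblp_rows = [{"name": "ada  example", "papers": 42}]

rows = merge_sources(wikipedia_rows, dblp_rows)
```

Here both placeholder rows collapse into a single row because their names normalize to the same key. A real pipeline would probably need a stronger identifier, such as an affiliation or a profile URL, before trusting a match.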

Primary Language: Jupyter Notebook

This repository is not active