CTLungCancerWeb

Source 1:

  • Data source: NCBI
  • Crawler tool: Scrapy
  • Directory: crawler/multiplesources/one/spiders/

Source 2:

  • Data source: Wikimedia
  • Crawler tool: Scrapy
  • Directory: crawler/multiplesources/one/spiders/

Source 3:

  • Data source: Yandex
  • Crawler tool: Scrapy
  • Directory: crawler/multiplesources/one/spiders/

Source 4:

  • Data source: ERJ
  • Crawler tool: Selenium
  • Directory: crawler/ERJ

Source 5:

Source 6:

  • Data source: DuckDuckGo
  • Crawler tool: Selenium
  • Directory: crawler/duckduckdo

Source 7:

Source 8:

  • Data source: PMC
  • Crawler tool: Selenium
  • Directory: crawler/PMC

csv_to_json:

To convert a CSV file to JSON, run the command:

python csv_to_json.py [csvFilePath] [jsonFilePath]
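
The contents of csv_to_json.py are not shown here, so the following is only a minimal sketch of what such a script might look like. It assumes the CSV has a header row and that the output should be a JSON array with one object per row; the function name csv_to_json and its return value are illustrative, not taken from the repository.

```python
import csv
import json
import sys


def csv_to_json(csv_file_path, json_file_path):
    """Read a CSV file with a header row and write its rows as a JSON array.

    Assumption: every value is kept as a string, exactly as csv.DictReader
    returns it. Returns the number of rows written.
    """
    with open(csv_file_path, newline="", encoding="utf-8") as csv_file:
        rows = list(csv.DictReader(csv_file))

    with open(json_file_path, "w", encoding="utf-8") as json_file:
        json.dump(rows, json_file, ensure_ascii=False, indent=2)

    return len(rows)


# Only convert when both paths are given on the command line:
#   python csv_to_json.py [csvFilePath] [jsonFilePath]
if __name__ == "__main__" and len(sys.argv) == 3:
    count = csv_to_json(sys.argv[1], sys.argv[2])
    print(f"Wrote {count} rows to {sys.argv[2]}")
```

In this sketch numeric columns stay as strings; if the crawled data needs typed values, a conversion step would have to be added per column.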