Repo for airport data scraped from Wikipedia

Primary language: JavaScript

wikiscrape

Steps to run:

  • Clone the repository.
  • Run npm i to install all the required modules.
  • Run npm start to run the scraper.

Modules Used:

TODOs

  • Go through each collection of airports by clicking the A-Z letter buttons. You have to click through the buttons on the page; do not manipulate URLs and call page.goto().
  • After navigating to a collection of airports, scrape the IATA, ICAO, Airport Name and Location Served columns for each airport.
  • Convert Airport Name and Location Served to lower case, and replace whitespace characters and commas with ‘_’ (underscore). For all columns, substitute numbers with the string ‘DAPI’. Example of input-output: “Ana1 Airport, Set” → “anadapi_airport__set”
  • Save all scraped information into a JSON file.
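
The transformation and save steps in the TODOs could be sketched roughly as below. This is a minimal illustration, not the repository's actual code: the function names (replaceNumbers, normalizeText, normalizeAirport, saveAirports) and the record field names are hypothetical. It also assumes that a run of consecutive digits counts as one number, and that digits are substituted before lowercasing, since the example output shows ‘dapi’ in lower case.

```javascript
const fs = require('fs');

// Replace each run of digits with 'DAPI'.
// Assumption: consecutive digits count as a single number.
function replaceNumbers(value) {
  return value.replace(/\d+/g, 'DAPI');
}

// Substitute digits first, then lowercase, then turn every whitespace
// character and comma into an underscore. Doing the digit substitution
// first is what makes 'DAPI' come out as 'dapi', as in the README example.
function normalizeText(value) {
  return replaceNumbers(value).toLowerCase().replace(/[\s,]/g, '_');
}

// Hypothetical record shape: one object per scraped table row.
function normalizeAirport(row) {
  return {
    iata: replaceNumbers(row.iata),
    icao: replaceNumbers(row.icao),
    name: normalizeText(row.name),
    location: normalizeText(row.location),
  };
}

// Write the collected records to a JSON file (pretty-printed).
function saveAirports(airports, filePath) {
  fs.writeFileSync(filePath, JSON.stringify(airports, null, 2));
}

// Usage with made-up sample data:
const sample = normalizeAirport({
  iata: 'AN1',
  icao: 'AB12',
  name: 'Ana1 Airport, Set',
  location: 'Set Town',
});
console.log(sample.name); // anadapi_airport__set
```

Only the name and location fields are lowercased here; the IATA and ICAO codes get the digit substitution alone, which is one reading of "for all columns".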