webscraping-letterwho

Web scraping Doctor Who content for letterwho.

Run the project

  1. Clone this repository

    git clone https://github.com/K0rgana/webscraping-letterwho.git
  2. Create and activate the Python virtual environment

    python -m venv venv  # create the Python virtual environment
    .\venv\Scripts\activate.ps1  # activate the virtual environment on Windows (PowerShell)
  3. Install dependencies

    pip install -r requirements.txt
  4. Set the variable "u_range" (line 71) in the file "bigfinish.com/bigfinish/spiders/bf_stories.py" to the URL of your choice, or select a range from the "ranges_hub" object (defined on line 7).

    #bigfinish.com>bigfinish>spiders>bf_stories.py
    
    # See how the URL is divided:
    #              |           base         |      range       | page |           filters
    #urlExemple = 'https://www.bigfinish.com/ranges/v/torchwood/page:{}?url=ranges/v/torchwood&sort_ordering=date_asc'
    
    # Paste the range name/URL from the Big Finish site inside the quotes, after "/v/"
    u_range = 'ranges/v/torchwood'  # Before
    u_range = 'ranges/v/doctor-who---companion-chronicles'  # After
    
    # OR pick an entry from ranges_hub by its key
    u_range = ranges_hub["1"]  # Before
    u_range = ranges_hub["42"]  # After
  5. Run the file of your choice with Python. This generates a file containing the scraped data

    python bigfinish.com/bigfinish/spiders/bf_stories.py
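
To make the URL layout from step 4 more concrete, here is a minimal sketch of how a page URL could be assembled from base, range, page, and filters. The helper `build_range_url`, the constant `BASE_URL`, and the dictionary `example_ranges_hub` are hypothetical names used only for illustration; they are not taken from `bf_stories.py`, and the real `ranges_hub` may hold different keys and values.

```python
# Hypothetical illustration of the URL layout; not code from the repository.
BASE_URL = "https://www.bigfinish.com/"


def build_range_url(u_range: str, page: int, sort_ordering: str = "date_asc") -> str:
    """Assemble a listing URL as base + range + page + filters."""
    return f"{BASE_URL}{u_range}/page:{page}?url={u_range}&sort_ordering={sort_ordering}"


# Stand-in for the project's ranges_hub (assumed shape: key -> range path).
example_ranges_hub = {"1": "ranges/v/torchwood"}

# Option 1: pass the range path directly.
print(build_range_url("ranges/v/doctor-who---companion-chronicles", 1))

# Option 2: look the range path up by key.
print(build_range_url(example_ranges_hub["1"], 2))
```

The range path appears twice in the URL, once in the path and once in the `url` filter, so changing `u_range` updates both places at once.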