webscraping-letterwho

Web scraping Doctor Who content for letterwho.

Run the project

Clone this repository

git clone https://github.com/K0rgana/webscraping-letterwho.git

Create and activate the Python virtual environment

python -m venv venv # create the Python virtual environment
.\venv\Scripts\activate.ps1 # activate the virtual environment on Windows (PowerShell)

Install dependencies
```
pip install -r requirements.txt
```

Edit the variable "u_range" (line 71) in the file "bigfinish.com/bigfinish/spiders/bf_stories.py": set it to the URL of the range you want, or select one of the ranges from the object "ranges_hub" (see line 7).

#bigfinish.com>bigfinish>spiders>bf_stories.py

# See how the URL is divided:
#              |           base         |      range       | page |           filters
#urlExemple = 'https://www.bigfinish.com/ranges/v/torchwood/page:{}?url=ranges/v/torchwood&sort_ordering=date_asc'

# Paste the range name/URL from the Big Finish site inside the ' ', after the "/v/"
u_range= 'ranges/v/torchwood' # Before
u_range= 'ranges/v/doctor-who---companion-chronicles' # After

#OR
u_range= ranges_hub["1"] # Before
u_range= ranges_hub["42"] # After
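
The URL breakdown above (base, range, page, filters) can be sketched as a small helper. This is only an illustration of how the pieces fit together: the names `BASE_URL` and `build_range_url` are hypothetical and are not taken from "bf_stories.py", which may assemble the URL differently.

```python
# Hypothetical helper: illustrates the URL layout described above.
# BASE_URL and build_range_url are assumed names, not part of bf_stories.py.
BASE_URL = "https://www.bigfinish.com/"


def build_range_url(u_range: str, page: int) -> str:
    """Compose a listing URL from base + range + page + filters."""
    # The filters repeat the range and sort the stories by ascending date.
    filters = f"?url={u_range}&sort_ordering=date_asc"
    return f"{BASE_URL}{u_range}/page:{page}{filters}"


# Example: first page of the Torchwood range
print(build_range_url("ranges/v/torchwood", 1))
```

Changing only the "u_range" value, for example to 'ranges/v/doctor-who---companion-chronicles', changes both the path and the "url" filter, which is why a single variable is enough to switch ranges.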

Run the file of your choice with Python. This generates a file with the scraped data.
```
python bigfinish.com/bigfinish/spiders/bf_stories.py
```