Web scraping Doctor Who content for letterwho
-
Clone this repository
git clone https://github.com/K0rgana/webscraping-letterwho.git
-
Create & activate the Python virtual environment
python -m venv venv  # create the Python virtual environment
.\venv\Scripts\activate.ps1  # activate the virtual environment on Windows
-
Install dependencies
pip install -r requirements.txt
-
Edit the variable "u_range" (line 71) in the file "bigfinish.com/bigfinish/spiders/bf_stories.py": set it to the URL path of the range you want, or pick one of the ranges from the object "ranges_hub" (defined on line 7).
# bigfinish.com > bigfinish > spiders > bf_stories.py
# See how the URL is divided:
# | base | range | page | filters
# urlExemple = 'https://www.bigfinish.com/ranges/v/torchwood/page:{}?url=ranges/v/torchwood&sort_ordering=date_asc'
# PASTE INSIDE THE ' ' AND AFTER THE "/v/" THE RANGE NAME/URL FROM THE BIG FINISH SITE
u_range = 'ranges/v/torchwood'  # Before
u_range = 'ranges/v/doctor-who---companion-chronicles'  # After
# OR
u_range = ranges_hub["1"]  # Before
u_range = ranges_hub["42"]  # After
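To make the URL layout above concrete, here is a minimal sketch of how a page URL could be assembled from the "u_range" value. The helper name "build_range_url" and the constant "BASE" are hypothetical and are not taken from bf_stories.py; the spider itself may build its URLs differently. Only the URL pattern comes from the example comment above.

```python
# Hypothetical helper, for illustration only; not part of bf_stories.py.
BASE = "https://www.bigfinish.com/"  # the "base" part of the URL


def build_range_url(u_range: str, page: int, sort_ordering: str = "date_asc") -> str:
    """Join base, range, page and filters into one listing-page URL."""
    # The range path appears twice: once in the path and once in the "url" filter.
    return f"{BASE}{u_range}/page:{page}?url={u_range}&sort_ordering={sort_ordering}"


if __name__ == "__main__":
    # First listing page of the Torchwood range, oldest releases first.
    print(build_range_url("ranges/v/torchwood", 1))
```

Changing only the "u_range" argument (for example to 'ranges/v/doctor-who---companion-chronicles') switches the range while the page and filter parts keep the same shape.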
-
Run the spider file of your choice with Python. This generates a file containing the scraped data.
python bigfinish.com/bigfinish/spiders/bf_stories.py