Retrieve assessment data for a charter school network using a spider that follows hyperlinks and downloads the files containing testing results.
State Assessment Data Spider
There are three parts to creating this Spider:
1. Importing the proper packages:
import scrapy
from scrapy import Selector
import requests as r
from scrapy.crawler import CrawlerProcess
2. Creating the Spider:
class SpiderClassName(scrapy.Spider):
    name = 'SpiderName'

    def start_requests(self):
        # code here

    def parse(self, response):
        # code here
3. Running the Spider:
process = CrawlerProcess()      # instantiate the process
process.crawl(SpiderClassName)  # tell the process which Spider to run
process.start()                 # run the spider