My simple approach to building a web scraper using the Beautiful Soup library in Python
- Scrape a Wikipedia page and record which passages need citations.
- Your web scraper should report the number of citations needed.
- Your web scraper should identify those cases AND include the relevant passage.
- get_citations_needed_count(link) takes in a URL and returns an integer.
- get_citations_needed_report(link) takes in a URL and returns a string. The string should be formatted with each citation needed on its own line, in the order found.
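The two functions described above could be sketched roughly as follows. This is a minimal illustration, not the exact scripts: it assumes Wikipedia marks each "citation needed" tag with a link whose title is "Wikipedia:Citation needed", and that the relevant passage is the enclosing paragraph. The helpers fetch_html, find_passages, count_from_html and report_from_html are hypothetical names added here so the parsing can be tried on a plain HTML string without a network call.

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

# Assumption: Wikipedia renders each "citation needed" tag as a link
# carrying this title attribute. Adjust if the page markup differs.
MARKER_TITLE = "Wikipedia:Citation needed"


def fetch_html(link):
    """Download the page at `link` and return its HTML as text (hypothetical helper)."""
    request = Request(link, headers={"User-Agent": "citation-scraper-example/0.1"})
    with urlopen(request) as response:
        return response.read().decode("utf-8")


def find_passages(html):
    """Return the paragraph text for each marker, in the order found."""
    soup = BeautifulSoup(html, "html.parser")
    passages = []
    for marker in soup.find_all("a", title=MARKER_TITLE):
        paragraph = marker.find_parent("p")
        if paragraph is not None:
            # Collapse the paragraph to a single line of text.
            passages.append(paragraph.get_text(" ", strip=True))
    return passages


def count_from_html(html):
    """Count the citation-needed markers that sit inside a paragraph."""
    return len(find_passages(html))


def report_from_html(html):
    """Build the report: one passage per line, in the order found."""
    return "\n".join(find_passages(html))


def get_citations_needed_count(link):
    """Take in a URL and return the number of citations needed."""
    return count_from_html(fetch_html(link))


def get_citations_needed_report(link):
    """Take in a URL and return the passages needing citations, one per line."""
    return report_from_html(fetch_html(link))
```

In this sketch a paragraph containing two markers is counted, and listed in the report, twice. Calling get_citations_needed_count with a Wikipedia article URL would fetch the live page, so the result depends on the article's current state.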
I watched a bunch of YouTube videos from channels like Data Science Dojo and Hitesh Choudhary, and made these scripts.