A simple web scraping tool for recipe sites.
pip install recipe-scrapers
then:
from recipe_scrapers import scrape_me
# Pass the URL as a string; it can be a URL from any site listed below.
scraper = scrape_me('https://www.allrecipes.com/recipe/158968/spinach-and-feta-turkey-burgers/')
# Q: What if the recipe site I want to extract information from is not listed below?
# A: You can give it a try with the wild_mode option! If there is Schema/Recipe available it will work just fine.
scraper = scrape_me('https://www.feastingathome.com/tomato-risotto/', wild_mode=True)
scraper.title()
scraper.total_time()
scraper.yields()
scraper.ingredients()
scraper.instructions()
scraper.image()
scraper.host()
scraper.links()
scraper.nutrients() # if available
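The accessors above can be gathered into a single dictionary. What follows is a minimal sketch, not part of the package: `recipe_to_dict` and `FIELDS` are hypothetical names, and catching a broad `Exception` for fields that a site does not provide is an assumption about how failures surface.

```python
# Hypothetical helper: collect the accessor results shown above into one dict.
FIELDS = ("title", "total_time", "yields", "ingredients",
          "instructions", "image", "host", "links", "nutrients")


def recipe_to_dict(scraper, fields=FIELDS):
    """Call each accessor on `scraper` and return {field: value}.

    A field whose accessor is missing or raises is stored as None
    (assumption: a broad `except Exception` is acceptable in a quick script).
    """
    result = {}
    for name in fields:
        try:
            result[name] = getattr(scraper, name)()
        except Exception:
            result[name] = None
    return result
```

With a real scraper this would be called as `recipe_to_dict(scrape_me(url))`.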
Notes:
- Starting from v13.0.0 the package stopped suppressing scraper exceptions by default. If you want the previous behaviour:
import os
from recipe_scrapers import scrape_me
os.environ["RECIPE_SCRAPERS_SETTINGS"] = "recipe_scrapers.settings.v12_settings"
scraper = scrape_me(...) # etc.
- scraper.links() returns a list of dictionaries containing all of the <a> tag attributes. The attribute names are the dictionary keys.
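Because each entry returned by links() is a plain dictionary of <a> attributes, ordinary dictionary operations apply. A small sketch, assuming the attribute of interest is `href` (not every <a> tag has one); `absolute_hrefs` is a hypothetical helper name, not part of the package.

```python
from urllib.parse import urljoin


def absolute_hrefs(links, base_url):
    """Return the href of every link, resolved against `base_url`.

    `links` is a list of dicts shaped like the output of scraper.links();
    entries without an "href" key (or with an empty one) are skipped.
    """
    return [urljoin(base_url, link["href"]) for link in links if link.get("href")]
```

Relative links are resolved against the recipe page URL, so the result can be fetched or compared directly.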
- https://claudia.abril.com.br/
- https://www.acouplecooks.com
- http://www.afghankitchenrecipes.com/
- https://akispetretzikis.com/
- https://allrecipes.com/
- https://alltommat.se/
- https://amazingribs.com/
- https://ambitiouskitchen.com/
- https://archanaskitchen.com/
- https://www.atelierdeschefs.fr/
- https://averiecooks.com/
- https://baking-sense.com/
- https://bakingmischief.com/
- https://bbc.com/
- https://bbc.co.uk/
- https://bbcgoodfood.com/
- https://bettycrocker.com/
- https://bigoven.com/
- https://blueapron.com/
- https://bonappetit.com/
- https://bowlofdelicious.com/
- https://budgetbytes.com/
- https://castironketo.net/
- https://cdkitchen.com/
- https://chefkoch.de/
- https://closetcooking.com/
- https://comidinhasdochef.com/
- https://cookeatshare.com/
- https://cookieandkate.com/
- https://cookinglight.com/
- https://cookpad.com/
- https://cookstr.com/
- https://copykat.com/
- https://countryliving.com/
- https://cucchiaio.it/
- https://cuisineaz.com/
- https://cybercook.com.br/
- https://delish.com/
- https://www.ditchthecarbs.com/
- https://domesticate-me.com/
- https://downshiftology.com/
- https://www.dr.dk/
- https://www.eatingbirdfood.com/
- https://www.eatingwell.com/
- https://eatsmarter.com/
- https://eatsmarter.de/
- https://eatwhattonight.com/
- https://epicurious.com/
- https://recipes.farmhousedelivery.com/
- https://fifteenspatulas.com/
- https://finedininglovers.com/
- https://fitmencook.com/
- https://food.com/
- https://food52.com/
- https://foodandwine.com/
- https://foodnetwork.com/
- https://foodrepublic.com/
- https://www.forksoverknives.com/
- https://www.750g.com
- https://geniuskitchen.com/
- https://giallozafferano.it/
- https://gimmesomeoven.com/
- https://receitas.globo.com/
- https://gonnawantseconds.com/
- https://gousto.co.uk/
- https://greatbritishchefs.com/
- https://halfbakedharvest.com/
- https://www.hassanchef.com/
- https://headbangerskitchen.com/
- https://www.heb.com/
- https://heinzbrasil.com.br/
- https://hellofresh.com/
- https://hellofresh.co.uk/
- https://www.hellofresh.de/
- https://www.hellofresh.fr/
- https://www.hellofresh.nl/
- https://www.homechef.com/
- https://hostthetoast.com/
- https://receitas.ig.com.br/
- https://indianhealthyrecipes.com
- https://www.innit.com/
- https://inspiralized.com/
- https://jamieoliver.com/
- https://jimcooksfoodgood.com/
- https://joyfoodsunshine.com/
- https://justataste.com/
- https://justbento.com/
- https://www.justonecookbook.com/
- https://kennymcgovern.com/
- https://www.kingarthurbaking.com
- https://kochbar.de/
- http://koket.se/
- https://www.kptncook.com/
- https://kuchnia-domowa.pl/
- https://www.kwestiasmaku.com/
- https://www.latelierderoxane.com
- https://lecremedelacrumb.com/
- https://lekkerensimpel.com
- https://littlespicejar.com/
- http://livelytable.com/
- https://lovingitvegan.com/
- https://madensverden.dk/
- https://marmiton.org/
- https://www.marthastewart.com/
- https://matprat.no/
- https://www.melskitchencafe.com/
- http://mindmegette.hu/
- https://minimalistbaker.com/
- https://misya.info/
- https://www.mobkitchen.co.uk/
- https://momswithcrockpots.com/
- https://monsieur-cuisine.com/
- http://motherthyme.com/
- https://mybakingaddiction.com/
- https://mykitchen101.com/
- https://mykitchen101en.com/
- https://www.myplate.gov/
- https://myrecipes.com/
- https://healthyeating.nhlbi.nih.gov/
- https://nourishedbynutrition.com/
- https://nutritionbynathalie.com/blog
- https://cooking.nytimes.com/
- https://ohsheglows.com/
- https://101cookbooks.com/
- https://www.paleorunningmomma.com/
- https://www.panelinha.com.br/
- https://paninihappy.com/
- https://popsugar.com/
- https://practicalselfreliance.com/
- https://www.primaledgehealth.com/
- https://przepisy.pl/
- https://purelypope.com/
- https://purplecarrot.com/
- https://rachlmansfield.com/
- https://rainbowplantlife.com/
- https://realfood.tesco.com/
- https://realsimple.com/
- https://recipetineats.com/
- https://redhousespice.com/
- https://reishunger.de/
- https://rezeptwelt.de/
- https://sallysbakingaddiction.com
- https://sallys-blog.de
- https://www.saveur.com/
- https://seriouseats.com/
- https://simplyquinoa.com/
- https://simplyrecipes.com/
- https://simplywhisked.com/
- https://skinnytaste.com/
- https://southernliving.com/
- https://spendwithpennies.com/
- https://www.springlane.de
- https://steamykitchen.com/
- https://streetkitchen.hu/
- https://sunbasket.com/
- https://sundpaabudget.dk/
- https://sweetcsdesigns.com/
- https://sweetpeasandsaffron.com/
- https://tasteofhome.com
- https://tastesoflizzyt.com
- https://tasty.co
- https://tastykitchen.com/
- https://theclevercarrot.com/
- https://thehappyfoodie.co.uk/
- https://www.thekitchenmagpie.com/
- https://thekitchn.com/
- https://thenutritiouskitchen.co/
- https://thepioneerwoman.com/
- https://thespruceeats.com/
- https://thevintagemixer.com/
- https://thewoksoflife.com/
- https://timesofindia.com/
- https://tine.no/
- https://tudogostoso.com.br/
- https://twopeasandtheirpod.com/
- https://www.valdemarsro.dk/
- https://vanillaandbean.com/
- https://vegolosi.it/
- https://vegrecipesofindia.com/
- https://watchwhatueat.com/
- https://whatsgabycooking.com/
- https://www.wholefoodsmarket.com/
- https://www.wholefoodsmarket.co.uk/
- https://woop.co.nz/
- https://woolworths.com.au/shop/recipes
- https://en.wikibooks.org/
- https://yemek.com/
- https://yummly.com/
- https://zeit.de/ (wochenmarkt)
- https://zenbelly.com/
Part of the reason I want this open sourced is that when a site changes its design, the scraper for it has to be updated.
If you spot a design change (or anything else) that breaks the scraper for a given site, please open an issue as soon as possible.
If you are a programmer, PRs with fixes are warmly welcomed and acknowledged with a virtual beer.
Open an issue that gives us the site name and a link to a recipe from it.
You are a developer and want to code the scraper on your own:
If Schema is available on the site - you can do this
Otherwise, scrape the HTML - like this
Generating a new scraper class:
python generate.py <ClassName> <URL>
- ClassName: The name of the new scraper class.
- URL: The URL of an example recipe from the target site. The content will be stored in test_data to be used with the test class.
Assuming you have Python 3.7 or later installed, navigate to the directory where you want this project to live and run these commands:
git clone git@github.com:hhursev/recipe-scrapers.git &&
cd recipe-scrapers &&
python3 -m venv .venv &&
source .venv/bin/activate &&
pip install -r requirements-dev.txt &&
pre-commit install &&
python run_tests.py
In case you want to run a single unit test for a newly developed scraper:
python -m coverage run -m unittest tests.test_myscraper
- How do I know if a website has a Recipe Schema? Run in python shell:
from recipe_scrapers import scrape_me
scraper = scrape_me('<url of a recipe from the site>', wild_mode=True)
# if no error is raised - there's schema available:
scraper.title()
scraper.instructions() # etc.
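A rough offline check is also possible on HTML you have already downloaded. This sketch uses only the standard library and looks for a JSON-LD block whose `@type` is `Recipe`. It is an approximation, not how the package itself detects schema, and it ignores Microdata and RDFa markup. `has_recipe_jsonld` and the other names are hypothetical.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data


def _mentions_recipe(node):
    """Recursively look for a dict whose "@type" is (or includes) "Recipe"."""
    if isinstance(node, dict):
        types = node.get("@type")
        if types == "Recipe" or (isinstance(types, list) and "Recipe" in types):
            return True
        return any(_mentions_recipe(value) for value in node.values())
    if isinstance(node, list):
        return any(_mentions_recipe(item) for item in node)
    return False


def has_recipe_jsonld(html):
    """Return True if `html` contains a JSON-LD block describing a Recipe."""
    collector = _JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD
        if _mentions_recipe(data):
            return True
    return False
```

A `False` result here does not rule out wild_mode working, since a site may publish its Recipe schema in a format this sketch does not inspect.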
All the contributors who helped improve the package. You are awesome!