
Can we build a Python web scraper for PubMed that provides a direct option to bulk-extract literature into an Excel file?


PubMed_web_scraping☃️

🙌 An idea struck me: "Can we create a Python script for web scraping PubMed to bulk extract literature into an Excel file?" I fulfilled this goal. 🥳

⭐️The file named "link-pubmed.ipynb" implements this idea and is almost perfect: the last column of the Excel sheet it produces shows the PubMed ID (PMID) instead of a hyperlink. But don't worry, you can turn a PMID into a link to the article's PubMed page. For instance, use the following format: "https://pubmed.ncbi.nlm.nih.gov/32611469/" (where 32611469 is the PMID; feel free to replace it with your own PMID).

*Quick tip: enter the formula =HYPERLINK("https://pubmed.ncbi.nlm.nih.gov/" & E2) in cell F2 and drag it down to the end of column F. This gives you a new column of hyperlinks built from the PMIDs in column E. (Note the trailing "/" after the domain; without it the PMID is glued onto the domain name and the link breaks.)
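
If you would rather build the links in Python than in Excel, here is a minimal sketch of the same PMID-to-URL conversion. The helper names `pmid_to_url` and `hyperlink_formula` are made up for this example and are not taken from the notebooks.

```python
# Base address of the PubMed site, as used in the example link above.
PUBMED_BASE = "https://pubmed.ncbi.nlm.nih.gov"


def pmid_to_url(pmid):
    """Return the PubMed article URL for a PMID (hypothetical helper)."""
    pmid = str(pmid).strip()
    if not pmid.isdigit():
        raise ValueError(f"not a valid PMID: {pmid!r}")
    return f"{PUBMED_BASE}/{pmid}/"


def hyperlink_formula(pmid):
    """Return an Excel HYPERLINK formula string for a PMID (hypothetical helper)."""
    return f'=HYPERLINK("{pmid_to_url(pmid)}")'


if __name__ == "__main__":
    print(pmid_to_url(32611469))
    print(hyperlink_formula("32611469"))
```

You could apply `pmid_to_url` to every value in the PMID column before the sheet is written, so the last column already holds full addresses.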

⭐️Fortunately, the file named "python_address.ipynb" produces an Excel file that contains the title and the hyperlink of each article directly.

⭐️In both files, my PubMed search string is (ex-gaussian[Title/Abstract]) AND (reaction time[Title/Abstract]). I scraped 60 results, i.e. the first 6 pages of the PubMed results at 10 results per page. Please feel free to replace the search terms and the number of pages when you run your own search.
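
To show where the search words and the page count go, here is a rough sketch that only builds the result-page URLs (it does not download anything). It assumes the PubMed website accepts `term` and `page` as query parameters, which is how its search URLs looked to me; the function name `search_page_urls` is made up for this example, and the notebooks may build their URLs differently.

```python
from urllib.parse import urlencode

PUBMED_BASE = "https://pubmed.ncbi.nlm.nih.gov"


def search_page_urls(query, n_pages):
    """Return one search URL per result page (hypothetical helper).

    Assumes the 'term' and 'page' query parameters of the PubMed website.
    """
    urls = []
    for page in range(1, n_pages + 1):
        # urlencode escapes brackets, slashes and spaces in the search string.
        params = urlencode({"term": query, "page": page})
        urls.append(f"{PUBMED_BASE}/?{params}")
    return urls


if __name__ == "__main__":
    query = "(ex-gaussian[Title/Abstract]) AND (reaction time[Title/Abstract])"
    # 6 pages at 10 results per page gives the 60 results mentioned above.
    for url in search_page_urls(query, 6):
        print(url)
```

Change `query` and the page count to match your own search.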

🌻Thank you for your review. Merry Xmas🫡 2023