titipata/pubmed_parser

Question about downloading PubMed OA figures

pidugusundeep opened this issue · 6 comments

After running "Parse PubMed OA images and captions", I would like to know where I can download the actual figure files that correspond to the fig_id or graphic_ref values in the nxml document.

@pidugusundeep I'm tagging @daniel-acuna here. Do you know where we can download the dataset?

Search for the PubMed Open Access FTP site.

I was able to download all of the 'txt' files, but I need the source images for all of them. Could you please provide a sample? @daniel-acuna

For example, if you uncompress ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/00/00/PMC3363814.tar.gz, you will find the figures associated with the paper. The paths for all packages are listed in ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_file_list.csv
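
For anyone who wants to script this, here is a minimal Python sketch that downloads one package and copies out its image files. It assumes that the paths in oa_file_list.csv are relative to ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/ (as the example URL above suggests) and that figures can be recognized by common image extensions. The helper names (`package_url`, `figure_members`, `download_figures`) are made up for illustration and are not part of pubmed_parser.

```python
import os
import tarfile
import urllib.request

# Assumption: package paths are relative to this base, matching the example URL above.
PMC_FTP_BASE = "ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/"

# Assumption: figures are stored with one of these extensions.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff")


def package_url(relative_path):
    """Build the full FTP URL for a package path such as 'oa_package/00/00/PMC3363814.tar.gz'."""
    return PMC_FTP_BASE + relative_path.lstrip("/")


def figure_members(tar):
    """Return the regular-file members of an open tar archive that look like images."""
    return [
        m for m in tar.getmembers()
        if m.isfile() and m.name.lower().endswith(IMAGE_EXTENSIONS)
    ]


def download_figures(relative_path, out_dir="figures"):
    """Download one package to a temporary file and copy its image files into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    archive_path, _ = urllib.request.urlretrieve(package_url(relative_path))
    saved = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in figure_members(tar):
            # Write under the base name only, so archive paths cannot escape out_dir.
            target = os.path.join(out_dir, os.path.basename(member.name))
            with tar.extractfile(member) as src, open(target, "wb") as dst:
                dst.write(src.read())
            saved.append(target)
    return saved


if __name__ == "__main__":
    for path in download_figures("oa_package/00/00/PMC3363814.tar.gz"):
        print(path)
```

The extracted file names can then be compared with the graphic_ref values from the parsed nxml. Whether they match exactly or only up to the file extension is something to check against a sample package.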

Nice, thank you so much Daniel. I'll put the documentation in the repo before closing this issue.

I put instructions on how to download figures here. For anyone who finds this issue, see the Wiki page for more details.