ai_papers_with_code

Setup

Meet the data science cookiecutter requirements, in brief:
- Install: git-crypt and conda
- Have a Nesta AWS account configured with awscli

Run make install to configure the development environment:
- Set up the conda environment
- Configure pre-commit
- Configure metaflow to use AWS
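
Before running make install, it can help to confirm that the command-line tools listed above are on your PATH. The following is a minimal sketch, not part of this repository: the executable names (in particular "aws" for awscli, plus "git" and "make") are assumptions about how these tools are installed on your machine.

```python
import shutil

# Executable names are assumptions: "aws" is the usual awscli entry point,
# and "make" is needed to run `make install`.
REQUIRED_TOOLS = ("git", "git-crypt", "conda", "aws", "make")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found; run `make install` next.")
```

An empty result only means the executables exist; it does not check versions or that awscli is configured for the Nesta account.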

Run python ai_papers_with_code/pipeline/data/fetch_pwc_data.py to fetch the Papers with Code datasets.

The data is saved in inputs/data.

Run python ai_papers_with_code/pipeline/data/fetch_arxiv.py to fetch the arXiv tables from S3.

The data is saved in inputs/data.

NB: this requires AWS credentials.
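
One way to check that credentials are in place before fetching from S3 is to call the awscli command aws sts get-caller-identity, which succeeds only when valid credentials are found. The sketch below wraps that call; the function name aws_credentials_ok is hypothetical and not part of this repository. It confirms only that the credentials are valid, not that they grant access to the bucket the script reads from.

```python
import shutil
import subprocess


def aws_credentials_ok(timeout=30):
    """Return True if `aws sts get-caller-identity` succeeds, else False."""
    if shutil.which("aws") is None:
        return False  # awscli is not installed or not on PATH
    try:
        result = subprocess.run(
            ["aws", "sts", "get-caller-identity"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False  # the call hung or could not be started
    return result.returncode == 0


if __name__ == "__main__":
    if aws_credentials_ok():
        print("AWS credentials look valid.")
    else:
        print("No valid AWS credentials found; fetch_arxiv.py will likely fail.")
```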

TODO: Make this available to anyone

Run python ai_papers_with_code/pipeline/data/scrape_publications.py to scrape arXiv publications from the DeepMind and OpenAI websites.
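
The three data collection scripts above can also be run in one go. This is a minimal sketch under some assumptions: it is run from the repository root inside the project's conda environment, and the order simply follows this README (whether the scripts depend on one another is not stated). The helper names build_commands and run_all are hypothetical.

```python
import subprocess
import sys

# Script paths are taken from this README; run from the repository root.
PIPELINE_SCRIPTS = [
    "ai_papers_with_code/pipeline/data/fetch_pwc_data.py",
    "ai_papers_with_code/pipeline/data/fetch_arxiv.py",  # needs AWS credentials
    "ai_papers_with_code/pipeline/data/scrape_publications.py",
]


def build_commands(scripts=PIPELINE_SCRIPTS, python=sys.executable):
    """Return one [python, script] command per script, in order."""
    return [[python, script] for script in scripts]


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError if a script exits non-zero.
        subprocess.run(command, check=True)
```

Calling run_all(build_commands()) runs the three scripts with the current Python interpreter and stops at the first one that fails.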

Use the getters in ai_papers_with_code/getters/getters.py to load the Papers with Code tables.
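
This README does not list the individual getter functions, so one way to see what is available is to inspect the module. The sketch below is a generic helper (list_public_functions is a hypothetical name, not part of this repository) that lists the public functions defined in any importable module.

```python
import importlib
import inspect


def list_public_functions(module_name):
    """Return sorted names of public functions defined in the given module."""
    module = importlib.import_module(module_name)
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        # Keep only functions defined in this module, not ones it imports.
        if obj.__module__ == module.__name__ and not name.startswith("_")
    )
```

Assuming the package is importable after make install, list_public_functions("ai_papers_with_code.getters.getters") should return the names of the available getters.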