jacquietran/wnblr

Convert pkg structure to store functions that retrieve data from another repo

Closed this issue · 3 comments

Background

Currently, wnblr contains available game data from the 2014-2020 seasons, inclusive. The 2021 season starts today, and I want wnblr users to be able to obtain up-to-date game stats from the current season (and hopefully for future seasons too!).

Problem

With the way this package and my scraping routine work, keeping wnblr updated with new stats as games are played would create a lot of manual and repetitive work.

Solution

  • Convert wnblr to a package of functions that retrieve data from a separate GitHub repo, set up to store the game stats data.
  • The repo storing game stats data can then be regularly updated as new game stats become available (e.g., using GitHub Actions).

Getting to this end-state will require a bit of work, but it will improve how useful the package is and also make it easier for me to maintain.

Progress made

  • Improved the scraping routine to make it reproducible and ready for eventual scheduling
  • .rds and .csv versions of the four game stats data sets are now stored in jacquietran/wnblr_data.

I have some more data cleaning to do to address #28 and #29, but we are in a good spot to release the next version of {wnblr}, possibly later this week (week commencing Dec 6).

Next steps

  • Revise {wnblr} to be a package of functions rather than a package of data sets! The functions will call on the data stored in jacquietran/wnblr_data
  • Work on a scheduled jobs pipeline to periodically scrape and tidy new data (daily job?)
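
To illustrate the first step, here is a rough R sketch of what one of these functions might look like. It is not the final API: the function names, the `main` branch, the `data/` folder, and the `box_scores` file name are all placeholders. It reads the .csv version because `read.csv()` accepts a URL directly.

```r
# Hypothetical helper: build the raw-file URL for a data set in the data repo.
# ASSUMPTIONS: the branch name ("main") and the "data/<name>.<ext>" layout
# are placeholders, not the confirmed structure of jacquietran/wnblr_data.
wnblr_data_url <- function(name,
                           repo = "jacquietran/wnblr_data",
                           branch = "main",
                           ext = "csv") {
  sprintf("https://raw.githubusercontent.com/%s/%s/data/%s.%s",
          repo, branch, name, ext)
}

# Hypothetical loader: read the CSV version of a data set over HTTPS.
# `reader` can be swapped out, so the function can be checked offline.
load_wnblr_data <- function(name, reader = utils::read.csv, ...) {
  reader(wnblr_data_url(name), ...)
}
```

Under those assumptions, `load_wnblr_data("box_scores")` would return a data frame fetched from the data repo at call time, so users get whatever has most recently been pushed there without needing a new package release.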

This issue will close when the beta branch is merged into main.