Convert pkg structure to store functions that retrieve data from another repo
Background
Currently, `wnblr` contains available game data from the 2014-2020 seasons, inclusive. The 2021 season starts today, and I want `wnblr` users to be able to obtain up-to-date game stats from the current season (and hopefully for future seasons too!).
Problem
With the way this package and my scraping routine work, keeping `wnblr` updated with new stats as games are played would create a lot of manual and repetitive work.
Solution
- Convert `wnblr` to a package of functions: the functions will retrieve data from another GitHub repo, set up to store the game stats data.
- The repo storing game stats data can then be regularly updated as game stats are available (e.g., using GitHub Actions).
Getting to this end-state will require a bit of work, but it will improve how useful the package is and also make it easier for me to maintain.
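To make the idea concrete, here is a rough sketch of what one of these retrieval functions might look like. Everything here is illustrative rather than the actual package code: the function names `wnblr_data_url()` and `load_wnblr_data()`, the `main` branch, the `data` directory, and the `box_scores` data set name are all assumptions. Only the `jacquietran/wnblr_data` repo name comes from this issue.

```r
# Hypothetical helper: build the raw-file URL for one data set.
# The branch and directory defaults are assumptions about the data
# repo's layout, not confirmed details.
wnblr_data_url <- function(dataset,
                           repo = "jacquietran/wnblr_data",
                           branch = "main",
                           dir = "data") {
  paste0(
    "https://raw.githubusercontent.com/",
    repo, "/", branch, "/", dir, "/", dataset, ".csv"
  )
}

# Hypothetical loader: read the .csv version of a data set straight
# from the data repo, so users always get whatever was last pushed.
load_wnblr_data <- function(dataset, ...) {
  utils::read.csv(wnblr_data_url(dataset, ...), stringsAsFactors = FALSE)
}
```

With something like this in place, a call such as `load_wnblr_data("box_scores")` would fetch the latest file at call time, so new game stats would not need a new package release.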
The Twitter and streameR Discord hive minds suggested these reference points / resources:
Progress made
- Improved the scraping routine for reproducibility and prepping for eventual scheduling
- .rds and .csv versions of the four game stats data sets are now stored in jacquietran/wnblr_data.
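As a rough illustration of the storage step, the scraping routine could finish by writing each tidied data set in both formats. This is only a sketch: `save_both_formats()` and its arguments are hypothetical names, and the real routine may be organised differently.

```r
# Hypothetical helper: write one data frame as both .rds and .csv
# into the same output directory, returning the two file paths.
save_both_formats <- function(df, name, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  rds_path <- file.path(out_dir, paste0(name, ".rds"))
  csv_path <- file.path(out_dir, paste0(name, ".csv"))
  saveRDS(df, rds_path)                              # R-native, keeps column types
  utils::write.csv(df, csv_path, row.names = FALSE)  # portable plain text
  invisible(c(rds = rds_path, csv = csv_path))
}
```

Keeping both formats means R users can read the .rds files with column types intact, while the .csv files stay usable from other tools.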
I have some more data cleaning to do to address #28 and #29, but we are in a good spot to release the next version of {wnblr} possibly later this week (w/c Dec 6).
Next steps
- Revise {wnblr} to be a package of functions rather than a package of data sets! The functions will call on the data stored in jacquietran/wnblr_data
- Work on a scheduled jobs pipeline to periodically scrape and tidy new data (daily job?)
This issue will close when the beta branch is merged into main.