UBC-MDS/aridanalysis_py

Milestone 2 Meeting Minutes

March 1st 2021 Milestone 2 Kickoff:

  • Make sure Python package is usable by end of milestone
  • Revise functions first
  • Write tests before functions
  • Iterate functions/tests
  • Neel/Daniel to set up R repo
  • Wait until lecture before assigning all issues
  • Aim to have beta versions of the Python tests and functions ready by Thursday
  • Create the R repo structure first before working on R tasks
  • We should continue to have early group kickoff meetings
  • Update issues with notes when creating PRs

March 4th 2021 Lab Discussion:

  • During function revision and discussion with Tiffany, we came to the following clarification on function purposes:
    • Our regression functions will train sklearn models and report important statsmodels model values
    • Not enough time for PostLasso inference
    • We do not have to create a completely unique package
  • New high-level directive: Create wrapper functions that return a fitted sklearn model along with model-level and feature-level statistical summaries from the corresponding statsmodels model.
  • Model summary metrics to report:
    • N observations
    • Loss function used
    • Adjusted R-squared
    • F-test probability
  • Report feature coefficient summary table as a dataframe
  • Add a missing data plot to the EDA function

March 5th 2021 Group Check-in Meeting:

  • Use error_string constants file to keep DRY
  • Change data_frame inputs to df
  • arid_linreg() now returns both an sklearn model and a statsmodels model, along with some summary info
  • Remember that commit messages on R repo are subject to grading
  • Make sure to use the GitHub Flow workflow on the R repo; always branch from main and open PRs back into main
  • The R package will create an sklearn interface in the R environment.
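
As an illustration of the error_string constants idea from the check-in above, here is one possible shape for it. The constant names, `check_inputs`, and `error_message_for` are all hypothetical; in a package the constants would more likely live in their own module and be imported by both the functions and the tests.

```python
# Hypothetical shared error strings, defined once so functions and tests
# refer to the same text instead of repeating string literals.
RESPONSE_NOT_STR = "response must be a string column name"
FEATURES_NOT_LIST = "features must be a list of column names"


def check_inputs(response, features):
    """Raise TypeError with a shared message when an argument has the wrong type."""
    if not isinstance(response, str):
        raise TypeError(RESPONSE_NOT_STR)
    if not isinstance(features, list):
        raise TypeError(FEATURES_NOT_LIST)


def error_message_for(func, *args):
    """Return the TypeError message raised by func(*args), or None if none is raised."""
    try:
        func(*args)
    except TypeError as err:
        return str(err)
    return None
```

A test can then compare against the constant itself, for example `error_message_for(check_inputs, 1, ["x1"]) == RESPONSE_NOT_STR`, so rewording a message requires a change in only one place.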