[EPIC] Path to Kubeflow 1.7
Jose-Matsuda opened this issue · 2 comments
Kubeflow 1.7 has been released, so let's put together a roadmap.
- #1635
- #1637
- Investigate / plan out tickets for the crud-web-apps portion of the rebase. We have decided to stick with updating our backend for now, as we don't want to wait.
- Plan out tickets for central dashboard
Kubeflow Central Dashboard
- StatCan/kubeflow#135
- #1716
- StatCan/aaw-private#96
- #1718
- https://github.com/StatCan/aaw-private/issues/97
- KF 1.7 Fix tests (Tests all seemed to pass so no fixing required)
- StatCan/kubeflow#143
- KF 1.7 fix workflows
- Test Central Dashboard in dev
Jupyter-apis
- StatCan/jupyter-apis#216
- StatCan/jupyter-apis#225
- StatCan/jupyter-apis#230
- StatCan/jupyter-apis#226
- StatCan/jupyter-apis#229
- StatCan/jupyter-apis#231
- StatCan/jupyter-apis#217
- StatCan/jupyter-apis#218
- StatCan/jupyter-apis#219
- #1707
- #1706
Once jupyter-apis is functional
- StatCan/jupyter-apis#245
- StatCan/jupyter-apis#220
- StatCan/jupyter-apis#221
- StatCan/jupyter-apis#240
- StatCan/jupyter-apis#244
- StatCan/aaw-private#98
Test and Debugging
Other
Some of these are notes that came up while doing the rebase.
- Make sure the volume and kubecost tables are properly integrated (mainly in config.ts and index-default.ts under the jupyter index component)
- tslint.json: We already had this file, but there is now an eslintrc.json. We might want to delete tslint.json once we confirm that ESLint works. (The file has been deleted and linting still works. TSLint is deprecated, and ESLint is the recommended replacement.)
- karma.conf.js: Need to confirm whether the Karma tests are still useful and still work, and whether the files and proxies settings need to be more specific to our folder structure.
- Deploy either locally, or on dev (relies on kubernetes 1.24)
- #1638
- #1729
- Test and document kubeflow 1.7 on dev
- Prepare for kubeflow 1.7 to prod
- Investigate why the metrics file does not work with the more recent package.json fixes. We had to comment out a path.
- StatCan/aaw-kubeflow-manifests#363
Decisions to take
- Do we want to get our icons back? They were removed from the form pages
- Decide if we use tensorboard (probably not)
TODO in the future (maybe 1.8)
- Investigate jupyter-apis kubeflow/kubeflow#5201 (comment). May want to deploy this on our cluster without any customizations to test things out.
- Revisit the kubecost table UI. The common Table component got updated to have filtering. We don't need filtering and pagination on the kubecost table since it only has one row at all times.
Old stuff
Crud-web-apps / Jupyter-apis Tasks
Note that we may move to just the StatCan/kubeflow repo, especially if the backend fix mentioned solves our problems and we don't need to maintain jupyter-apis. I do feel that this part of the 1.7 upgrade should wait until we have tested the possible performance enhancements, as we don't want to rebase on top of jupyter-apis only to find out we didn't need to.
Given that Kubeflow seems to release roughly every half year (and doesn't seem to do patch releases), we would probably need to wait until October, for 1.8.0, before the proposed backend fix lands in an official release. In the meantime, we can choose to pick that PR up, add it to our kubeflow branch, and test it if we want to.
Older information
- StatCan/jupyter-apis#132
- StatCan/jupyter-apis#131 <-- may just need to take this, and then add on top of it (as in any commits / customizations made after our rebase to 1.6)
Task List
- placeholder rebase related commits 1 (basically the fix-1.6 issues)
- ...
Centraldashboard
Should probably see how much customization there was, as we may not need to break this out into an epic. There was custom work done by the team since the 1.6 rebase but I don't think it should be too bad, especially since we're moving up a minor version anyways.
After talking with Bryan, who did the 1.6 rebase, we agreed that this sort of ticket would be difficult to split into smaller subtasks, and it's possibly just one big ticket for one person to chew through.
Older Information
PR from 1.4 -> 1.6
Also this PR from Rohan --> CVE fix.