Usability issues with Datasets and Notebooks
Opened this issue · 2 comments
ckadner commented
Describe the bug
It is not clear how to run notebooks and datasets. There should be succinct documentation covering:
- How are notebooks and datasets related?
- From a dataset, users can navigate to a notebook, but not vice versa?!
- Need to describe why running a dataset asks for a namespace (-> PVC creation)
- What happens after a Dataset has been "launched"?
- Running a dataset with a related asset (notebook) asks for a PVC. How does a user get one?
- What should the Mount Path be? (-> /tmp/data ... MLX UI should prefill)
- Why does the pipeline run show no inputs and outputs?
- What are the results of a Notebook run, and where can they be found (Minio > HTML, original notebook updated, preview in MLX UI Notebook card -- unless cached?)
- Download Notebook, after a run, should contain the updated notebook and/or the HTML of the last run
- Add a Troubleshooting section, e.g. for the 403 error while trying to create a PVC when missing permissions to create CRDs
We should update the doc here:
https://github.com/machine-learning-exchange/mlx/blob/main/datasets/README.md#use-dataset-with-mlx-assets
Thanks @blublinsky for reporting
ckadner commented
Add Troubleshooting for the 403 error when trying to "Run" a Dataset:
- by default, Kubeflow cannot deploy any CRD resource on the cluster
- need to patch the cluster with:
kubectl create clusterrolebinding pipeline-runner-extend --clusterrole cluster-admin --serviceaccount=kubeflow:pipeline-runner
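To confirm whether the patch above is needed (or has taken effect), the service account's permissions can be queried with `kubectl auth can-i`. The following is only a sketch: the `can_create` helper and its default namespace are hypothetical names introduced for illustration, the service account is taken from the command above, and the `--as` impersonation assumes the caller is allowed to impersonate service accounts.

```shell
# Service account taken from the clusterrolebinding command above.
PIPELINE_SA="system:serviceaccount:kubeflow:pipeline-runner"

# Hypothetical helper: asks the API server whether the service account may
# create the given resource type; kubectl prints "yes" or "no".
# $1 = resource type, $2 = namespace to check in (assumed default: kubeflow)
can_create() {
  kubectl auth can-i create "$1" --as="$PIPELINE_SA" --namespace "${2:-kubeflow}"
}

# Example usage (requires access to the cluster):
#   can_create persistentvolumeclaims my-namespace
```

A "no" answer for `persistentvolumeclaims` in the namespace chosen when running the Dataset would be consistent with the 403 error described here.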
@yhwang -- thanks for reporting that