Issues
- 2
OSError when there are too many concurrent processes
#5730 opened by siaimes - 0
If I want to use openpai to adapt to npu hardware, how should I configure it?
#5813 opened by fjq123123 - 0
Feedback v1.8.0
#5810 opened by hlkzh - 1
Can not receive job status change message
#5809 opened by HaoLiuHust - 0
Support for different hardware configurations for different task roles of one distributed job.
#5808 opened by siaimes - 3
Prebuilt docker image for aarch64
#5807 opened by huww98 - 2
can I use external website as webportal plugin?
#5771 opened by HaoLiuHust - 1
- 0
How to deploy custom plugins in the webportal?
#5804 opened by hlyf-xs - 0
FAILED - RETRYING: ensure docker packages are installed.
#5796 opened by cn-bp - 5
- 1
Can't Install Docker Image
#5794 opened by brianjsl - 1
Failing to Join to Cluster during deployment
#5792 opened by brianjsl - 5
- 2
I followed the documentation to update the certificate and the cluster crashed.
#5787 opened by siaimes - 2
- 1
Command 'PAI: Add PAI Cluster' resulted in an error (command 'paiext.cluster.add' not found)
#5780 opened by zlwzlwzlw - 1
Uninstall Pai service but interrupted by 'nvidia-device-plugin-daemonset'
#5781 opened by 18AlexHua18 - 0
Feedback v1.7.0
#5785 opened by Strive-for-excellence - 4
Every one can clone other's job, seems not safe
#5770 opened by HaoLiuHust - 2
Can't redirect to k8s dashboard management page
#5779 opened by Ivens-Zhang - 4
Samba / NFS Integration Storage-Manager Documentation needed / Behavior infuse Security Problem
#5772 opened by lkaupp - 0
Upgrade nodeJS version for PAI services
#5775 opened by Binyang2014 - 2
stdout/stderr: Log folder can not be retrieved
#5765 opened by wangxianglang - 0
NoJobError happens when use rest-api to submit a job
#5769 opened by HaoLiuHust - 1
- 1
Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod
#5766 opened by HaoLiuHust - 1
Problems about OpenPAI running without Internet
#5679 opened by wangxianglang - 4
how to set job-status-change-notification, can not receive job notification
#5759 opened by HaoLiuHust - 1
- 0
how to set not pull image for every job
#5744 opened by HaoLiuHust - 1
Is there a doc about restful API?
#5739 opened by HaoLiuHust - 3
How to increase memory for worker?
#5740 opened by zsh4614 - 1
Can we use api to submit job?
#5737 opened by HaoLiuHust - 1
Invalid Format: : should match exactly one schema in oneOf when use vscode extension
#5736 opened by HaoLiuHust - 0
Can I put node with different number gpu in a vc
#5735 opened by HaoLiuHust - 4
Where to login after install?
#5732 opened by HaoLiuHust - 1
how to run job on master?
#5731 opened by HaoLiuHust - 1
when restart rest-server, UnhandledPromiseRejectionWarning: Error: launcher config error
#5733 opened by HaoLiuHust - 1
unreachable error when add or delete node
#5727 opened by JohanOu - 0
error when add node
#5724 opened by JohanOu - 0
After installing, some worker nodes NotReady
#5725 opened by JohanOu - 1
[bug report] When a task was cloned, the TensorBoard port was not regenerated, so the TensorBoard could not be started.
#5721 opened by siaimes - 4
connection closed when ssh to job
#5682 opened by edenbuaa - 1
sometimes submit job failed.
#5720 opened by zsh4614 - 0
something about Hardware
#5707 opened by zsh4614 - 2
Unable to retrieve log for submitted jobs
#5677 opened by chinkit-ffc