Error when using hai-cli
Closed this issue · 1 comments
yolunghiu commented
- Set up a cluster and deployed hai-platform locally; the deployment succeeded, with the following status
(base) ➜ ~ k get pod -n hai-platform
NAME             READY   STATUS    RESTARTS   AGE
hai-platform-0   1/1     Running   0          20h
(base) ➜ ~ k get svc -n hai-platform
NAME               TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                                                     AGE
hai-platform-svc   LoadBalancer   10.68.214.177   192.168.1.201   5432:30461/TCP,6379:30977/TCP,80:30599/TCP,8080:30411/TCP   20h
(base) ➜ ~ k get node
NAME     STATUS                     ROLES    AGE   VERSION
master   Ready,SchedulingDisabled   master   44h   v1.29.0
worker   Ready                      node     44h   v1.29.0
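- hai-cli initialization also appears to succeed
- After the platform was deployed, the user table contained two records, haiadmin and bff_admin, but the user_access_token table was empty
- Following this issue comment, I inserted a record for haiadmin into the user_access_token table: #5 (comment)
- Result of initialization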
(hai) ark@zero:~/code/hai/hai-platform$ hai-cli init ACCESS-68516961646d696e2368616961646d696e-E0lGXwIswnn0HpbXAW_tVRjga1wRjD0u --url http://192.168.1.201
初始化成功, 目标配置 /home/ark/.hfai/conf.yml, 配置如下:
token: ACCESS-68516961646d696e2368616961646d696e-E0lGXwIswnn0HpbXAW_tVRjga1wRjD0u
url: http://192.168.1.201
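As a quick sanity check on the token passed to `hai-cli init`, the middle segment looks like hex-encoded ASCII. That is an assumption based only on the token's shape, not on documented hai-platform behavior, and the helper name `decode_token_segment` is hypothetical. A minimal sketch that decodes the segment so it can be compared with the values inserted into user_access_token:

```python
def decode_token_segment(token: str) -> str:
    """Decode the middle segment of an ACCESS-<hex>-<secret> style token.

    Assumption: the text between the first two '-' characters is
    hex-encoded ASCII. This is inferred from the token's shape only.
    """
    prefix, hex_part, _secret = token.split("-", 2)
    if prefix != "ACCESS":
        raise ValueError(f"unexpected token prefix: {prefix!r}")
    return bytes.fromhex(hex_part).decode("ascii")


# Token copied from the hai-cli init command above.
token = "ACCESS-68516961646d696e2368616961646d696e-E0lGXwIswnn0HpbXAW_tVRjga1wRjD0u"
print(decode_token_segment(token))  # prints: hQiadmin#haiadmin
```

If the decoded text differs from what was meant to be stored for haiadmin, that mismatch is one thing worth ruling out before looking further.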
- Submitting a task fails
(hai) ark@zero:~/code/hai/hai-platform$ hai-cli python /nfsroot/hai-platform/workspace/haiadmin/test.py -- -n 1
WARNING: 提交的任务将会继承当前环境 ,有可能造成环境不兼容,如不想继承当前环境请添加参数 --no_inherit
提交任务成功,定义如下
--------------------------------------------------------------------------------
name: test.py
priority: 30
resource:
group: default
image: default
node_count: 1
spec:
entrypoint: test.py
parameters: ''
workspace: /nfsroot/hai-platform/workspace/haiadmin
version: 2
--------------------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/hfai/client/api/api_utils.py", line 101, in async_requests
result = json.loads(result)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ark/Program/anaconda3/envs/hai/bin/hai-cli", line 9, in <module>
sys.exit(cli())
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/asyncclick/core.py", line 1159, in __call__
return anyio.run(self._main, main, args, kwargs, **({"backend":_anyio_backend} if _anyio_backend is not None else {}))
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/anyio/_core/_eventloop.py", line 68, in run
return asynclib.run(func, *args, **backend_options)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 204, in run
return native_run(wrapper(), debug=debug)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/asyncio/base_events.py", line 608, in run_until_complete
return future.result()
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 199, in wrapper
return await func(*args)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/asyncclick/core.py", line 1162, in _main
return await main(*args, **kwargs)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/asyncclick/core.py", line 1083, in main
rv = await self.invoke(ctx)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/asyncclick/core.py", line 1693, in invoke
return await _process_result(await sub_ctx.command.invoke(sub_ctx))
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/asyncclick/core.py", line 1429, in invoke
return await ctx.invoke(self.callback, **ctx.params)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/asyncclick/core.py", line 783, in invoke
rv = await rv
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/hfai/client/commands/hfai_python.py", line 294, in python
await func_python_cluster(experiment_py, experiment_args, name, nodes, priority, group, image, environments,
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/hfai/client/commands/hfai_python.py", line 255, in func_python_cluster
await hfai_experiment.run.callback(config, follow, None, None, None)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/hfai/client/commands/hfai_experiment.py", line 167, in run
experiment = await create_experiment(experiment_yml)
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/hfai/client/api/experiment_api.py", line 444, in create_experiment
result = await async_requests(RequestMethod.POST, url=f'{mars_url()}/operating/task/create?token={token}',
File "/home/ark/Program/anaconda3/envs/hai/lib/python3.8/site-packages/hfai/client/api/api_utils.py", line 116, in async_requests
raise Exception(f'请求失败: [exception: {str(e)}] [result: {result}]')
Exception: 请求失败: [exception: Expecting value: line 1 column 1 (char 0)] [result: Internal Server Error]
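The first traceback is a symptom rather than the cause: the server replied with the plain-text body `Internal Server Error`, and the client tried to decode it as JSON. The sketch below reproduces that decoder error using only the standard library and shows one way to surface the server failure more directly. `parse_response` and its parameters are hypothetical and are not part of hfai.

```python
import json


def parse_response(status: int, body: str):
    """Return the decoded JSON body, or raise an error that keeps the raw body.

    Hypothetical helper, not part of hfai: it checks the HTTP status before
    decoding, so a plain-text error page is reported as a server error.
    """
    if status >= 400:
        raise RuntimeError(f"server returned HTTP {status}: {body!r}")
    try:
        return json.loads(body)
    except json.JSONDecodeError as e:
        raise RuntimeError(f"response is not JSON ({e}): {body!r}") from e


# Reproduce the decoder error from the traceback with the body the server sent.
try:
    json.loads("Internal Server Error")
except json.JSONDecodeError as e:
    decode_error = str(e)
    print(decode_error)  # prints: Expecting value: line 1 column 1 (char 0)
```

Either way, the body suggests the request to `/operating/task/create` failed on the server side, so the logs of the hai-platform-0 pod are the likely place to find the underlying error.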
yolunghiu commented