WIPACrepo/pyglidein

Ensuring working pyGlidein

Closed this issue · 0 comments

I am trying to connect two remote machines, each running a personal HTCondor, using pyGlidein.

To start the pyglidein server at the central HTCondor pool, I use the following command:
pyglidein_server --port 33001 --constraint "'WantPSCBridges == true'"

To start the pyglidein client at the central HTCondor pool, I use the following command:
pyglidein_client --config=configuration.config --secrets=SECRETS_CONFIG_FILE

Logs on server:
INFO:server:condor_q
condor_q -global -autoformat:, RequestCPUs RequestMemory RequestDisk RequestGPUs -format "%s" Requirements -constraint "JobStatus =?= 1" -constraint 'WantPSCBridges == true' -allusers
INFO:tornado.access:200 POST /jsonrpc (198.248.248.24) 1.91ms
INFO:tornado.access:200 POST /jsonrpc (198.248.248.24) 1.69ms
INFO:tornado.access:200 POST /jsonrpc (198.248.248.24) 1.20ms

Logs on client:
2019-07-15 13:21:57,223 DEBUG {u'jsonrpc': u'2.0', u'result': [], u'id': 0}
2019-07-15 13:21:57,223 INFO no state, nothing to do
2019-07-15 13:21:57,272 DEBUG avg_idle_time for Cluster: 0s.
2019-07-15 13:21:57,272 DEBUG min_idle_time for Cluster: 86400000000000s.
2019-07-15 13:21:57,272 DEBUG max_idle_time for Cluster: 0s.
2019-07-15 13:21:57,280 DEBUG {u'jsonrpc': u'2.0', u'result': u'', u'id': 1}
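
One way to read the client DEBUG lines above, as a minimal sketch rather than anything taken from pyGlidein itself: the client received a well-formed JSON-RPC reply, so the HTTP round trip worked, but `result` is an empty list, which matches the "no state, nothing to do" message. That would be consistent with the server's condor_q query finding no idle jobs that satisfy the `WantPSCBridges == true` constraint. The helper names `parse_debug_payload` and `describe_state` are hypothetical and exist only for this illustration.

```python
import ast


def parse_debug_payload(line):
    """Extract the dict printed at the end of a client DEBUG log line.

    Hypothetical helper: it assumes the payload starts at the first '{'
    and is a plain Python literal, as in the log excerpt above.
    """
    start = line.index("{")
    return ast.literal_eval(line[start:])


def describe_state(response):
    """Turn a JSON-RPC style reply into a short diagnosis (illustrative only)."""
    if "error" in response:
        return "server returned an error"
    result = response.get("result")
    if not result:
        # An empty list or empty string: the call succeeded but carried no work.
        return "connected, but no matching idle jobs reported"
    return "connected, %d job group(s) reported" % len(result)


line = "2019-07-15 13:21:57,223 DEBUG {u'jsonrpc': u'2.0', u'result': [], u'id': 0}"
print(describe_state(parse_debug_payload(line)))
```

If this reading is right, the thing to check next is probably the job itself, for example whether its ClassAd actually defines `WantPSCBridges` as true, rather than the network path between client and server.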

I am trying to submit a job from the central HTCondor pool to run on the client, but the job goes into the Idle state and never runs. I am not sure whether this is a pyGlidein connectivity issue or a job requirements issue. Is there a way to verify that pyGlidein is configured and connected properly?
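
A rough way to separate the two possibilities is to query the server's `/jsonrpc` endpoint by hand from the client machine. The sketch below uses only the Python standard library. The port (33001) and the `/jsonrpc` path come from the commands and logs above; the method name `get_state` and the placeholder host `SERVER_HOST` are assumptions and should be checked against the pyGlidein source before relying on them.

```python
import json
import urllib.request


def build_request(method, request_id=0, params=None):
    """Build a JSON-RPC 2.0 request body as a dict."""
    payload = {"jsonrpc": "2.0", "method": method, "id": request_id}
    if params is not None:
        payload["params"] = params
    return payload


def call_server(url, method, timeout=10):
    """POST one JSON-RPC request to url and return the decoded reply."""
    data = json.dumps(build_request(method)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (not executed here; 'get_state' and SERVER_HOST are assumptions):
# reply = call_server("http://SERVER_HOST:33001/jsonrpc", "get_state")
# print(reply)
```

A reply with a non-empty `result` would suggest the server sees matching idle jobs and the connection is fine. An empty `result` would point toward the job's attributes or requirements instead, and a timeout or connection error would point toward the network or the server process.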