microsoft/PlanetaryComputerExamples

label-maker-dask.ipynb example tutorial fails

benmack opened this issue · 3 comments

The https://github.com/microsoft/PlanetaryComputerExamples/blob/main/tutorials/label-maker-dask.ipynb tutorial fails for me.

When running lmj.execute_job() I get many repetitions of the following warning:

distributed.worker - WARNING - Compute Failed
Function:  execute_task
args:      ((<function tile_to_label at 0x7f6dd4e45c10>, Tile(x=15550, y=12548, z=15), 'segmentation', [(<class 'dict'>, [['name', 'Roads'], ['filter', ['has', 'highway']]]), (<class 'dict'>, [['name', 'Buildings'], ['filter', ['has', 'building']]])], 'https://qa-tiles-server-dev.ds.io/services/z17/tiles/{z}/{x}/{y}.pbf'))
kwargs:    {}
Exception: 'SSLError(MaxRetryError(\'HTTPSConnectionPool(host=\\\'qa-tiles-server-dev.ds.io\\\', port=443): Max retries exceeded with url: /services/z17/tiles/15/15550/12548.pbf (Caused by SSLError(CertificateError("hostname \\\'qa-tiles-server-dev.ds.io\\\' doesn\\\'t match either of \\\'*.azure-api.net\\\', \\\'*.portal.azure-api.net\\\', \\\'*.management.azure-api.net\\\', \\\'*.scm.azure-api.net\\\', \\\'*.configuration.azure-api.net\\\', \\\'*.regional.azure-api.net\\\', \\\'*.developer.azure-api.net\\\', \\\'*.data.azure-api.net\\\'")))\'))'

Followed by an SSL error:

---------------------------------------------------------------------------
SSLError                                  Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 lmj.execute_job()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/label_maker_dask/main.py:108, in LabelMakerJob.execute_job(self)
    106 def execute_job(self):
    107     """compute the labels and images"""
--> 108     self.results = dask.compute(*self.tasks)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/dask/base.py:573, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    570     keys.append(x.__dask_keys__())
    571     postcomputes.append(x.__dask_postcompute__())
--> 573 results = schedule(dsk, keys, **kwargs)
    574 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/client.py:3010, in Client.get(self, dsk, keys, workers, allow_other_workers, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   3008         should_rejoin = False
   3009 try:
-> 3010     results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   3011 finally:
   3012     for f in futures.values():

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/client.py:2162, in Client.gather(self, futures, errors, direct, asynchronous)
   2160 else:
   2161     local_worker = None
-> 2162 return self.sync(
   2163     self._gather,
   2164     futures,
   2165     errors=errors,
   2166     direct=direct,
   2167     local_worker=local_worker,
   2168     asynchronous=asynchronous,
   2169 )

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/utils.py:311, in SyncMethodMixin.sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    309     return future
    310 else:
--> 311     return sync(
    312         self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    313     )

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/utils.py:378, in sync(loop, func, callback_timeout, *args, **kwargs)
    376 if error:
    377     typ, exc, tb = error
--> 378     raise exc.with_traceback(tb)
    379 else:
    380     return result

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/utils.py:351, in sync.<locals>.f()
    349         future = asyncio.wait_for(future, callback_timeout)
    350     future = asyncio.ensure_future(future)
--> 351     result = yield future
    352 except Exception:
    353     error = sys.exc_info()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/tornado/gen.py:762, in Runner.run(self)
    759 exc_info = None
    761 try:
--> 762     value = future.result()
    763 except Exception:
    764     exc_info = sys.exc_info()

File /srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/client.py:2025, in Client._gather(self, futures, errors, direct, local_worker)
   2023         exc = CancelledError(key)
   2024     else:
-> 2025         raise exception.with_traceback(traceback)
   2026     raise exc
   2027 if errors == "skip":

File /srv/conda/envs/notebook/lib/python3.8/site-packages/label_maker_dask/main.py:38, in tile_to_label()
     22 """
     23 Parameters
     24 ------------
   (...)
     34     representing the label of the tile
     35 """
     37 url = label_source.format(x=tile.x, y=tile.y, z=tile.z)
---> 38 r = requests.get(url)
     39 r.raise_for_status()
     41 tile_data = mapbox_vector_tile.decode(r.content)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/api.py:75, in get()
     64 def get(url, params=None, **kwargs):
     65     r"""Sends a GET request.
     66 
     67     :param url: URL for the new :class:`Request` object.
   (...)
     72     :rtype: requests.Response
     73     """
---> 75     return request('get', url, params=params, **kwargs)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/api.py:61, in request()
     57 # By using the 'with' statement we are sure the session is closed, thus we
     58 # avoid leaving sockets open which can trigger a ResourceWarning in some
     59 # cases, and look like a memory leak in others.
     60 with sessions.Session() as session:
---> 61     return session.request(method=method, url=url, **kwargs)

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/sessions.py:529, in request()
    524 send_kwargs = {
    525     'timeout': timeout,
    526     'allow_redirects': allow_redirects,
    527 }
    528 send_kwargs.update(settings)
--> 529 resp = self.send(prep, **send_kwargs)
    531 return resp

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/sessions.py:645, in send()
    642 start = preferred_clock()
    644 # Send the request
--> 645 r = adapter.send(request, **kwargs)
    647 # Total elapsed time of the request (approximately)
    648 elapsed = preferred_clock() - start

File /srv/conda/envs/notebook/lib/python3.8/site-packages/requests/adapters.py:517, in send()
    513         raise ProxyError(e, request=request)
    515     if isinstance(e.reason, _SSLError):
    516         # This branch is for urllib3 v1.22 and later.
--> 517         raise SSLError(e, request=request)
    519     raise ConnectionError(e, request=request)
    521 except ClosedPoolError as e:

SSLError: None: Max retries exceeded with url: /services/z17/tiles/15/15550/12548.pbf (Caused by None)

Looks like the label_source does not work as suggested in the notebook.
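
To confirm that the tile server itself is the problem, independent of Dask, a single tile can be requested directly. The snippet below is a minimal sketch that assumes only the Python standard library. `Tile` is a stand-in namedtuple for the tile object shown in the warning, and `build_tile_url` and `check_label_source` are hypothetical helper names, not part of label_maker_dask. The URL is built with the same `label_source.format(x=..., y=..., z=...)` call that appears in the traceback.

```python
import ssl
import urllib.error
import urllib.request
from collections import namedtuple

# Stand-in for the Tile object seen in the warning (assumption: only x, y, z matter here).
Tile = namedtuple("Tile", ["x", "y", "z"])


def build_tile_url(label_source, tile):
    """Fill the {z}/{x}/{y} placeholders the same way tile_to_label does."""
    return label_source.format(x=tile.x, y=tile.y, z=tile.z)


def check_label_source(label_source, tile, timeout=10):
    """Request one tile and return (ok, message) instead of raising.

    Hypothetical helper for debugging; it does not decode the tile.
    """
    url = build_tile_url(label_source, tile)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, ssl.SSLError, OSError) as exc:
        # Covers DNS failures, refused connections, timeouts and
        # certificate problems such as a hostname mismatch.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    label_source = "https://qa-tiles-server-dev.ds.io/services/z17/tiles/{z}/{x}/{y}.pbf"
    ok, message = check_label_source(label_source, Tile(x=15550, y=12548, z=15))
    print(ok, message)
```

If this reports a certificate error for a single request, the failure is in the `label_source` endpoint rather than in the Dask cluster or the notebook code.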

cc @drewbo. Do you have any insight into the qa-tiles-server-dev.ds.io tile server?

Hi @TomAugspurger, sorry I don't have insight into the state of that server (I've transitioned to a new role). cc: @geohacker @batpad for more info

Thanks for the ping @drewbo! @TomAugspurger I'm not sure what the state of that server is, sorry! We noticed this on PEARL as well and I assumed that you guys must have cleaned up those dev stacks. I think I may have also lost access to the PC devops account.

If the stack wasn't deleted, is it possible for you to paste the logs here? In the meantime, let me try and get access to my devops account.

Update: I tried to log in to Azure DevOps but got a 401 😭