databrickslabs/databricks-sync

Error exporting identity

coreystokes-8451 opened this issue · 10 comments

Hello,

I receive the following error when exporting identity:

Traceback (most recent call last):
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/sync/export.py", line 74, in export
    exp.run()
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/pipeline.py", line 595, in run
    self.__generate_all()
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/pipeline.py", line 584, in __generate_all
    loop.run_until_complete(groups)
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/pipeline.py", line 87, in trigger
    async for item in self.generate():
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/pipeline.py", line 92, in generate
    async for item in self._generate():
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/generators/identity.py", line 477, in _generate
    service_principals_data[id_] = self.get_service_principal_dict(service_principal)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/generators/identity.py", line 412, in get_service_principal_dict
    ServicePrincipalSchema.DISPLAY_NAME: sp["displayName"],
KeyError: 'displayName'

If I comment out line 412 in identity.py, the export succeeds.

Hey @coreystokes-8451, thanks for reporting this error. Did you have a service principal in your account by any chance? And what cloud are you operating on?

@stikkireddy I checked, and there are a few service principals in the users / admin groups that don't exist in our tenant. We use Azure for our cloud. I'm going to try to clean those up first and report back.

UPDATE: A call to the Azure Databricks "GET Service principals" API endpoint doesn't return a "displayName", if that's any help.
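For anyone wanting to check their own workspace for the same condition, here is a minimal sketch that scans a service principal listing for entries without a displayName. It assumes the response follows the usual SCIM list shape (a "Resources" array whose entries carry "id" and "applicationId"); that shape and the sample values are assumptions, not something confirmed in this thread. The function name is hypothetical.

```python
from typing import Any, Dict, List


def principals_missing_display_name(scim_response: Dict[str, Any]) -> List[str]:
    """Return an identifier for each service principal lacking a displayName."""
    missing = []
    for sp in scim_response.get("Resources", []):
        # Treat a missing key and an empty string the same way.
        if not sp.get("displayName"):
            # Prefer applicationId; fall back to the SCIM id if it is absent too.
            missing.append(sp.get("applicationId") or sp.get("id", "<unknown>"))
    return missing


# Hypothetical payload shaped like a SCIM list response; the values are made up.
sample = {
    "Resources": [
        {"id": "101", "applicationId": "aaaa-1111", "displayName": "etl-runner"},
        {"id": "102", "applicationId": "bbbb-2222"},
    ]
}
print(principals_missing_display_name(sample))
```

Any identifier this returns is a service principal that would hit the KeyError in the traceback above.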

@coreystokes-8451 🤣 Eng said displayName is required. Let me dig in and get back to you ASAP.

So the Terraform provider requires a displayName. Hmm, if displayName doesn't exist, would using the applicationId as the displayName work for you?
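A rough sketch of that fallback, for illustration only. This is not the actual patch in the branch; the helper name is hypothetical, and it assumes the service principal dict exposes the application ID under the key "applicationId".

```python
from typing import Any, Dict


def display_name_or_application_id(sp: Dict[str, Any]) -> str:
    """Use displayName when present, otherwise fall back to applicationId."""
    name = sp.get("displayName")
    if name:
        return name
    # Assumption: applicationId is always present on a service principal.
    # A KeyError here would point at a different data problem.
    return sp["applicationId"]


# Made-up example values.
print(display_name_or_application_id({"applicationId": "bbbb-2222"}))
print(display_name_or_application_id(
    {"applicationId": "aaaa-1111", "displayName": "etl-runner"}
))
```

Using `.get()` rather than `sp["displayName"]` is what avoids the KeyError; the applicationId then serves as a stable, non-empty stand-in for the name the Terraform provider requires.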

@stikkireddy that would work

@coreystokes-8451 can you try this:
pip install git+https://github.com/databrickslabs/databricks-sync@patch-sp-missing-display-name

Unfortunately, I can't seem to create a service principal without a displayName on my end, and I don't have access to create fresh service principals either. If that works for you, I will create a new 0.2.2 release for this fix.

@stikkireddy This worked! But I do get the following error when the --dask flag is set:

Traceback (most recent call last):
  File "/Users/c501854/sanbox/databrick_sync/venv/bin/databricks-sync", line 8, in <module>
    sys.exit(cli())
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_cli/configure/config.py", line 55, in decorator
    return function(*args, **kwargs)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/cmds/config.py", line 178, in modify_user_agent
    return function(*args, **kwargs)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/cmds/config.py", line 161, in decorator
    return function(*args, **kwargs)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/cmds/export.py", line 33, in export_cli
    ExportCoordinator.export(api_client, Path(config_path), dask_mode=dask, dry_run=dry_run, git_ssh_url=git_ssh_url,
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/databricks_sync/sdk/sync/export.py", line 46, in export
    from distributed import Client
  File "/Users/c501854/sanbox/databrick_sync/venv/lib/python3.8/site-packages/distributed/__init__.py", line 3, in <module>

@coreystokes-8451 Apologies for the delay. It seems we updated dask but not distributed and streamz. Can you please try again with the same branch? I have pushed up the changes.
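If it helps to confirm what actually got installed in the venv, a small sketch along these lines prints the installed versions of the three packages. It only uses the standard library (`importlib.metadata`, available from Python 3.8); the helper name is hypothetical, and which version combinations are compatible is not something this snippet checks.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if not installed."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(installed_versions(["dask", "distributed", "streamz"]))
```

Comparing the dask and distributed entries before and after reinstalling from the branch shows whether the dependency update was picked up.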

@stikkireddy Sorry for the delay. This is working correctly now.

Awesome, then closing this issue!