graphistry/pygraphistry

[FEA] First-class client support for easier multitenancy

Is your feature request related to a problem? Please describe.

Currently, multitenant apps need to take a lock and call register() right before each .plot() to ensure the right credentials are used at upload time.

It'd be easier if we had a first-class functional client model: a client is configured with its own creds, and as long as a plottable uses the right client, the plot works without any global locking.

Describe the solution you'd like

Internal base interface Client

  • Mostly around JWT & org manipulations
  • g.client() : Client , g._client.token(), g._client.refresh(), g._client.switch(), etc
  • Variants: ClientUnconfigured, ClientUserPass, ClientAPIKey, ClientSSO, ...
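
A minimal sketch of what such an interface could look like, assuming each client owns its credentials and cached JWT. Every name and signature below is hypothetical (nothing here exists in PyGraphistry today), and the login step is a placeholder rather than a real auth call:

```python
from abc import ABC, abstractmethod
from typing import Optional


class Client(ABC):
    """Hypothetical base interface: owns credentials, JWT, and org state."""

    def __init__(self, org: Optional[str] = None) -> None:
        self._org = org
        self._token: Optional[str] = None

    @abstractmethod
    def _login(self) -> str:
        """Exchange this client's credentials for a fresh JWT."""

    def token(self) -> str:
        # Log in lazily on first use, then reuse the cached token
        if self._token is None:
            self._token = self._login()
        return self._token

    def refresh(self) -> str:
        # Discard the cached token and log in again
        self._token = self._login()
        return self._token

    def switch_org(self, org: str) -> "Client":
        # Changing org invalidates the cached token
        self._org = org
        self._token = None
        return self


class ClientUnconfigured(Client):
    def _login(self) -> str:
        raise RuntimeError("No credentials configured")


class ClientUserPass(Client):
    def __init__(self, username: str, password: str, org: Optional[str] = None) -> None:
        super().__init__(org)
        self._username = username
        self._password = password

    def _login(self) -> str:
        # Placeholder only: a real client would call the server's auth endpoint
        return f"jwt:{self._username}:{self._org}"
```

Because all token and org state lives on the client instance, two tenants never share mutable state unless they share a client object.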

Global and root chain clients

Recall that g objects functionally build up, like g2 = g1.edges(df)

Each plottable will either hold its own client, or hold a reference to another plottable and use that plottable's client, resolved dynamically at use time (e.g., plot). The client can be set in two ways: a global default for legacy notebook convenience, and explicit roots for more controllable multitenant scenarios:

  1. Backwards compatible global/single-tenant: for notebook users and legacy code
  • module has a single global client
  • graphistry.register(); g = graphistry.bind(...); g.plot():
    • top-level graphistry.register() mutates global client
    • by default, chained items use the latest global client
  • add graphistry.register(client=...)
    • potentially deprecate old access flows
    • modifying the client c bound in client=c impacts all plottables using it
  • access via c = graphistry.client_root() and graphistry.client_root(c)
  2. New: Custom chainable roots, for multitenant
  • Generate a top-level g via explicit client instantiation

    • g = ClientUserPass().g()
    • g = graphistry.client(ClientUserPass())
  • Mid-chain, setting a new root that impacts all downstream, and optionally the current

    • g2 = g1.client(Client(...)) <- g2 uses a different client
    • g1.set_client(Client(...)) <- updates g1's client, and also downstream plottables like g2

Question

Default behavior during chaining: I can imagine g.copy() chaining to automatically do one of these:
* g2._client_root = g1 # so changing g1 will change g2
* g2._client_root = g1._client_root # so changing g1's root will change g2's, but changing g1 will not (unless it has the root)

Basically, which should it be:

  • g._client_root : Client, so by ref eq of a client set at time of chain
  • g._client_root : Union[Plottable, Client], so when Plottable, a plot-time dynamic lookup of a parent plottable
    • and is that parent plottable g1? so requiring a traversal as deep as the functional chain
    • ... or g1._client_root, which short-cuts the chain to probably just 1-2 hops, so just explicit client/set_client points at time of creation?
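
To make the two candidate behaviors concrete, here is a toy sketch, assuming a stripped-down Plottable that only tracks `_client_root`. The helper names `chain_to_parent`, `chain_to_root`, and `resolve_client` are hypothetical and exist only to contrast the options:

```python
from typing import Union


class Client:
    def __init__(self, name: str) -> None:
        self.name = name


class Plottable:
    def __init__(self, client_root: Union["Plottable", Client]) -> None:
        self._client_root = client_root

    def resolve_client(self) -> Client:
        # Use-time lookup: follow parent references until a Client is found
        node = self._client_root
        while isinstance(node, Plottable):
            node = node._client_root
        return node

    def chain_to_parent(self) -> "Plottable":
        # Option 1: child points at its immediate parent (traversal as deep as the chain)
        return Plottable(self)

    def chain_to_root(self) -> "Plottable":
        # Option 2: child copies the parent's root reference (short-cuts the chain)
        return Plottable(self._client_root)

    def set_client(self, client: Client) -> None:
        self._client_root = client


c1, c2 = Client("tenant-1"), Client("tenant-2")
g1 = Plottable(c1)
g2_parent = g1.chain_to_parent()
g2_root = g1.chain_to_root()

g1.set_client(c2)  # mutate g1 after chaining
print(g2_parent.resolve_client().name)  # tenant-2: follows g1 at use time
print(g2_root.resolve_client().name)    # tenant-1: bound to the client g1 had at chain time
```

Under option 1, every later `set_client` on an ancestor propagates downstream. Under option 2, propagation only happens through whichever object was the root at chain time.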

Examples

  1. Legacy notebooks: Change nothing!
graphistry.register(...)
g.plot()
graphistry.register(... switch org ...)
g.plot()
  2. New tenant notebooks: Refactor to use clients
c = graphistry.ClientUserPass()
graphistry.register(c)
g.plot()
graphistry.client_root().switch_org('new org')
g.plot()
  3. Multitenant apps: Create new client roots
g1 = graphistry.client(graphistry.ClientUserPass(...))
g2 = g1.edges(...)
g1.get_client().switch_org(..) # same effect
  4. Multitenant apps: Late decision making

No need to track and work at the initial g root level

g1 = graphistry.client(graphistry.ClientUserPass(...))
g2 = g1.edges(...)
g2.set_client(c2)
g2.plot()
g3 = g2.client(c3)
g3.plot()
g2.get_client().switch(...)
g3.plot()

Rollout

Initially, we can likely target something like:

Phase 1: Simple standalone clients

  • Clients: Start with just user/pass and api key, for simplicity
  • Roots: Do both root + downstream modes, so flexibility in when setting new tenants
  • Register interaction: minimize
    • Do not yet refactor register(...) params to use clients
    • When client mode is detected (e.g., existence of clients), ignore the register flow
      • Raise an Unsupported warning when mixing of new and old modes is detected
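
One way the Phase 1 register interaction could behave, as a rough sketch. `Session`, `new_client`, and `UnsupportedWarning` are hypothetical stand-ins for module-level state, not existing PyGraphistry names, and "client mode" is assumed to mean "at least one explicit client has been created":

```python
import warnings
from typing import Any, Dict


class UnsupportedWarning(UserWarning):
    """Hypothetical warning category for mixing legacy and client flows."""


class Session:
    """Toy stand-in for module-level state; not the real graphistry module."""

    def __init__(self) -> None:
        self.client_mode = False  # set once any explicit client exists
        self.global_creds: Dict[str, Any] = {}

    def new_client(self, **creds: Any) -> Dict[str, Any]:
        # Creating an explicit client switches the session into client mode
        self.client_mode = True
        return dict(creds)

    def register(self, **creds: Any) -> bool:
        # Legacy flow: ignored, with a warning, once client mode is active
        if self.client_mode:
            warnings.warn(
                "register() ignored: explicit clients are in use",
                UnsupportedWarning,
                stacklevel=2,
            )
            return False
        self.global_creds = dict(creds)
        return True
```

Returning a flag instead of raising keeps legacy notebooks running while still surfacing the mixed usage.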

Phase 2: SSO

  • Add SSO client support

Phase 3: Backport

  • Refactor old register code paths to use new ones
  • Probably as a sequence of PRs, like userpass/api/jwt first, and then sso after
  • Decide whether to deprecate old calling forms, keep current form as nice sugar, or give warnings if this reveals ambiguities

Describe alternatives you've considered

We may also be able to do something simple like overload plot():

c = ClientXYZ(..)
g.plot(client=c)

But I'd rather not complicate plot(), and would prefer something more decoupled, such as locks, contexts, or these _client objects:

with ClientXYZ(...) as c:
  g.plot()
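
A sketch of how the context-manager form might be wired up with the standard library's `contextvars`, so the active client is scoped per thread or async task rather than global. `ClientXYZ` is the placeholder name from above, and the free function `plot()` is a hypothetical stand-in for `g.plot()`:

```python
from contextvars import ContextVar, Token
from typing import List, Optional

# Holds the client that is active in the current thread / async task
_current_client: ContextVar[Optional["ClientXYZ"]] = ContextVar(
    "current_client", default=None
)


class ClientXYZ:
    def __init__(self, name: str) -> None:
        self.name = name
        self._reset_tokens: List[Token] = []  # supports nested re-entry

    def __enter__(self) -> "ClientXYZ":
        # Make this client the active one and remember how to undo that
        self._reset_tokens.append(_current_client.set(self))
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Restore whichever client was active before entering
        _current_client.reset(self._reset_tokens.pop())


def plot() -> str:
    # Stand-in for g.plot(): picks up the active client at call time
    client = _current_client.get()
    if client is None:
        raise RuntimeError("No active client")
    return f"uploaded as {client.name}"


with ClientXYZ("tenant-a"):
    print(plot())  # uploaded as tenant-a
```

Unlike a shared lock, this does not serialize plots across tenants; each thread or task sees only the client it entered.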

Additional context

Current multitenancy workaround is a locked register() right before plot():

# create a shared global lock
from threading import Lock
global_lock = Lock()

# do normal graphistry plottable buildup as usual
g_1 = graphistry.edges(...)...
...
g_n = g_1...

# change plot() call to include a locked just-in-time register:
with global_lock:
  graphistry.register(...)
  g_n.plot()