authzed/authzed-node

Random hangs with grpc-js version later than 1.6.9

Closed this issue · 4 comments

After upgrading authzed-node to v0.12.1 (from 0.10.0), I have noticed random latency spikes on various calls (CheckPermission, LookupResources). The average response time for these queries is <50ms; however, during a spike a response can take >60000ms (even for a simple CheckPermission).

Environment:

  • NodeJS v18.7.1
  • NestJS v10
  • SpiceDB 1.24.0

Troubleshooting steps undertaken:

  • Upgrading my SpiceDB environment from 1.20.0 to 1.24.0 (persists)
  • Checked MySQL for slow queries (no success)
  • Added instrumentation/perf measurement to my application to determine where the performance issue lies (it pointed to the one-line 'checkPermission' client call)
  • As a result of this, I downgraded back to 0.10.0 (known good for me), but the issue persisted
  • After researching the changes made, I realised I'd reset my package-lock.json, which, despite authzed-node being version-pinned, resulted in the grpc-js dependency being resolved to a newer version than I was previously running
  • Reverting @grpc/grpc-js in my package-lock.json to 1.6.9 seems (at this point) to stop the issue occurring
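
For anyone who wants to narrow a spike down to a single client call in the same way, here is a minimal sketch of a timing wrapper. It is not the instrumentation used above; `timed`, `SlowCallReport`, and the default threshold are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical helper, not part of authzed-node or grpc-js.
type SlowCallReport = { label: string; elapsedMs: number };

// Runs `call`, measures its wall-clock duration, and reports it through
// `onSlow` when it takes at least `slowThresholdMs`.
// Date.now() is not monotonic, but it is precise enough to tell a
// ~50 ms call apart from a ~60000 ms one.
async function timed<T>(
  label: string,
  call: () => Promise<T>,
  slowThresholdMs: number = 1000,
  onSlow: (report: SlowCallReport) => void = (r) =>
    console.warn(`${r.label} took ${r.elapsedMs} ms`),
): Promise<T> {
  const start = Date.now();
  try {
    return await call();
  } finally {
    // Runs on success and on failure, so hung-then-failed calls are reported too.
    const elapsedMs = Date.now() - start;
    if (elapsedMs >= slowThresholdMs) {
      onSlow({ label, elapsedMs });
    }
  }
}
```

Usage would look like `await timed("CheckPermission", () => checkPermission(request))`, where `checkPermission` stands for whatever promise-returning client call the application makes.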

Conclusion: grpc-js > 1.6.9 is resulting in random performance spikes.
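
To check which @grpc/grpc-js version a lock file actually resolves to, something like the sketch below could help. It assumes the lock file has already been parsed (for example with `JSON.parse`) and reads the `packages` map used by lockfileVersion 2 and 3, falling back to the top-level `dependencies` map of lockfileVersion 1. `LockFile`, `resolvedVersions`, and `compareVersions` are hypothetical names.

```typescript
// Minimal shape of the parts of package-lock.json this sketch reads.
interface LockFile {
  // lockfileVersion 2 and 3: keyed by install path, e.g. "node_modules/@grpc/grpc-js"
  packages?: Record<string, { version?: string }>;
  // lockfileVersion 1: keyed by package name (only the top level is checked here)
  dependencies?: Record<string, { version?: string }>;
}

// Returns every distinct version of `name` found in the lock file,
// including nested copies under other packages' node_modules.
function resolvedVersions(lock: LockFile, name: string): string[] {
  const found = new Set<string>();
  const packages = lock.packages ?? {};
  for (const path of Object.keys(packages)) {
    const isMatch =
      path === `node_modules/${name}` || path.endsWith(`/node_modules/${name}`);
    const version = packages[path].version;
    if (isMatch && version) found.add(version);
  }
  const topLevel = lock.dependencies?.[name]?.version;
  if (topLevel) found.add(topLevel);
  return Array.from(found);
}

// Compares plain "major.minor.patch" strings; pre-release tags are not handled.
// Returns a negative number, zero, or a positive number.
function compareVersions(a: string, b: string): number {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}
```

Any returned version for which `compareVersions(v, "1.6.9") > 0` is newer than the one reported as known good in this issue.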

We experienced the same problem with grpc-js version 1.9.1 and @authzed/authzed-node 0.9.1. We didn't find any clear pattern in which requests were slow and which ones weren't. Downgrading to grpc-js version 1.6.9 fixed the problem, though we didn't try any of the versions of grpc-js between 1.6.9 and 1.9.1.
Thanks @brentpi for reporting the fix!

@mpauly-exnaton

Further update: I've since upgraded my infrastructure to SpiceDB 1.27.0 and now run:

  • authzed-node ^0.13.0 (resolved to 0.13.0)
  • @grpc/grpc-js ~1.9.1 (resolved to 1.9.9)

I don't seem to have random connection resets, nor do I have excessively long permission resolution times (I've since implemented metrics on the app side to quantify this).
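
The thread doesn't say how those app-side metrics were built. As one possible shape, here is a minimal sketch that records call durations and summarises them, so that an occasional 60-second outlier shows up even when the average looks healthy. `LatencyRecorder` and its methods are hypothetical names.

```typescript
// Hypothetical in-memory recorder; a real service would more likely
// export these numbers to its existing metrics system.
class LatencyRecorder {
  private samples: number[] = [];

  // Store one observed call duration in milliseconds.
  record(ms: number): void {
    this.samples.push(ms);
  }

  // Nearest-rank percentile for p in (0, 100]; undefined when nothing was recorded.
  percentile(p: number): number | undefined {
    if (this.samples.length === 0) return undefined;
    const sorted = this.samples.slice().sort((a, b) => a - b);
    const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
    return sorted[rank - 1];
  }

  // Number of recorded calls that took longer than `thresholdMs`.
  countAbove(thresholdMs: number): number {
    return this.samples.filter((ms) => ms > thresholdMs).length;
  }
}
```

Tracking a high percentile together with `countAbove` for a generous threshold gives a direct way to confirm whether spikes like the ones described at the top of this issue are still occurring.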

@brentpi Thanks for the heads up! I upgraded our dependencies at the end of last week, and so far the problems haven't resurfaced for us either. If they reappear I'll post again.

Given that it's been a few months since these have manifested, I'm going to go ahead and close this issue. Please let us know if you run into this again, though!