Performance for plural endpoints is suboptimal
lesderid opened this issue · 0 comments
Some background: we use Kinto as a synced per-user data store, using a bucket per app deployment/environment, and a collection per user. Users can often have thousands of records (most fairly small, a handful per collection fairly large).
We're using postgresql for the storage and permission backends (and memcached for the cache backend).
The performance of plural endpoints (e.g. /v1/buckets/my-app-staging/collections/my-user/records to get all records in a collection) in the current server implementation is a bit disappointing (ignoring caching).
I've profiled the Kinto server using Sentry, by adding traces_sample_rate=1.0, _experiments={"profiles_sample_rate": 1.0} to the sentry_sdk.init() call. While the SQL queries themselves take a bit of time, the server also spends a considerable amount of time in library functions.
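For anyone who wants to reproduce the profile, here is a minimal sketch of that configuration. The two options are the ones quoted above; the helper name init_sentry_profiling and the idea of passing the DSN in as an argument are assumptions for illustration, not how Kinto wires up Sentry.

```python
# Options quoted in this issue: sample every transaction and profile every sampled one.
PROFILING_OPTIONS = {
    "traces_sample_rate": 1.0,
    "_experiments": {"profiles_sample_rate": 1.0},
}


def init_sentry_profiling(dsn):
    """Hypothetical helper: initialise the Sentry SDK with full tracing and profiling."""
    # Imported lazily so this module loads even when sentry_sdk is not installed.
    import sentry_sdk

    sentry_sdk.init(dsn=dsn, **PROFILING_OPTIONS)
```

Sampling at 1.0 is only sensible for a local or staging investigation like this one; a production deployment would normally use a much lower rate.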
JSON deserialisation
Swapping json.loads for msgspec.json.decode as SQLAlchemy's JSON deserialiser gives a substantial improvement:
Benchmark (json.loads): curl http://localhost:8888/v1/buckets/my-app-development/collections/my-user/records
Time (mean ± σ): 1.490 s ± 0.114 s [User: 0.006 s, System: 0.008 s]
Range (min … max): 1.327 s … 1.879 s 100 runs
Benchmark (msgspec.json.decode): curl http://localhost:8888/v1/buckets/my-app-development/collections/my-user/records
Time (mean ± σ): 1.267 s ± 0.052 s [User: 0.006 s, System: 0.007 s]
Range (min … max): 1.150 s … 1.428 s 100 runs
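The mean response time dropped from 1.490 s to 1.267 s for this collection (~3000 records): about 15% less time per request, or a ~18% improvement in throughput (1.490 / 1.267 ≈ 1.18).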
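A minimal sketch of the swap, assuming the engine is built with SQLAlchemy's create_engine(), which accepts a json_deserializer argument for the PostgreSQL dialects. The names fast_json_loads and make_engine are hypothetical, and this does not claim to be where Kinto creates its engine.

```python
import json

try:
    # msgspec is a third-party package; fall back to the stdlib decoder if it is missing.
    import msgspec

    fast_json_loads = msgspec.json.decode
except ImportError:
    fast_json_loads = json.loads


def make_engine(url):
    """Hypothetical helper: build an engine whose JSON columns are decoded by fast_json_loads."""
    # Imported lazily so this module loads even when SQLAlchemy is not installed.
    from sqlalchemy import create_engine

    # json_deserializer replaces the default json.loads for JSON/JSONB result values.
    return create_engine(url, json_deserializer=fast_json_loads)
```

It would be called with the usual connection URL, e.g. make_engine("postgresql://user:password@localhost/kinto"), where the URL is a placeholder. Both decoders return plain dicts and lists for this kind of data, so the rest of the code should not need to change.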