planetscale/database-js

Performance difference

rcoundon opened this issue · 12 comments

I did a small performance comparison with the standard MySQL connection and the new driver in a lambda function. (using this)

I found that when using the serverless driver the cold start times and the execution times in a warm lambda take at least twice as long, often longer. For example, for a cold start with MySQL it's 3-4s, with the HTTP driver, it's more like 10s. Warm starts are 1s compared to 3 or 4s

I haven't yet added any logging of what's taking the additional time, but before I dig into things further I was curious to find out what others' experiences were in terms of performance?

👋 hey there, do you have any code you can share that reflects this?

I'm curious if the impact is purely on boot up time of the lambda and not runtime? Is that correct?

In practice, the actual query times and whatnot should be almost identical in performance if not actually a bit faster using the HTTP interface than mysql.

But if it's startup times purely, I wonder if we have any insight into what would cause that. I've not used Lambda myself so I'm not too familiar with their runtime.

Also, not that it should affect this too much, are you using the global connection endpoint? e.g. aws.connect.psdb.cloud as opposed to {region}.connect.psdb.cloud. It should definitely speed up initial connection times in both cases.

👋 hey there, do you have any code you can share that reflects this?

I'm curious if the impact is purely on boot up time of the lambda and not runtime? Is that correct?

In practice, the actual query times and whatnot should be almost identical in performance if not actually a bit faster using the HTTP interface than mysql.

But if it's startup times purely, I wonder if we have any insight into what would cause that. I've not used Lambda myself so I'm not too familiar with their runtime.

Thanks for getting back to me, it's both cold start and warm start, i.e. with and without a start-up. I also posted in the discussions area, so here's what I posted there today:

To help illustrate what's happening, here's the trace with the serverless driver
image

and here with mysql driver
image

As you can see, after the query returns in the mysql version, the next API calls to another service happen immediately, whereas with the serverless driver, there's a big time gap. (The post to psdb.v1alpha1... is the PlanetScale query - I'm sure that's obvious to you but just wanted to be specific!)

This smells like an event loop problem in our own code (we're using Node.js) but the only code that's changed is to use the driver and provide got-fetch as the HTTP client rather than mysql native.

I'll see if I can somehow separate out the code to make this happen into a minimal reproducible sample but it'll take me a while

Yeah, that's an interesting trace since it's mostly suggesting to me that the actual execution and response time of the query is faster with the HTTP driver by a significant margin. I guess figuring out what happens after the response comes back seems to be the big missing block of time.

I wish I had more to add here, but we haven't observed this at all, and I even use this in a personal project in Vercel workers.

Seems sorta like a Promise not being awaited on properly or something along those lines?

Oh, I just realized you're using this: https://www.npmjs.com/package/got-fetch as the fetch implementation. I wonder if there are some incompatibility issues there?

I'm pretty confident we've tested with https://www.npmjs.com/package/node-fetch and the builtin Fetch APIs when they're available.

Out of curiosity, why got-fetch instead? I'd be interested in seeing if you simply swapped it out.

Also if you're using recent Node, fetch is native and built in iirc.

I just put together a very small test for this and it seems fine with got-fetch outside of lambda:

index.js:

import 'process';
import {Client} from '@planetscale/database';

import gotFetch from 'got-fetch';
import nodeFetch from 'node-fetch';

const useNodeFetch = process.env.FETCH == 'node';

const fetch = useNodeFetch ? nodeFetch : gotFetch;

const client = new Client({
    url: process.env.DATABASE_URL,
    fetch,
});

const conn = client.connection();

const start = Date.now();
const results = await conn.execute('select 1 from dual');
console.log("execute time:", Date.now() - start, "ms");
console.log("fetch impl:", useNodeFetch ? "node-fetch" : "got-fetch");
console.log(results);

And when running this with FETCH=node and without, I get effectively identical runtimes nowhere near multiple seconds.

Inside the results we return, we track our own time as well in results.time. I'm curious if that reflects the multiple seconds of time or if it's the shorter ~300ms in your trace.

EDIT: this was with node v16.18.0.

package.json:

{
  "name": "xxx",
  "version": "1.0.0",
  "description": "",
  "type": "module",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "volta": {
    "node": "16.18.0"
  },
  "dependencies": {
    "@planetscale/database": "^1.4.0",
    "got": "^12.5.2",
    "got-fetch": "^5.1.2",
    "node-fetch": "^3.2.10"
  }
}

Thanks for this - there was no specific reason for using got-fetch, I'll try node-fetch.
When accessing with a DATABASE_URL - what does that look like?
I've been using the host, username and password approach (which is what the console gives me if I select connect using @planetscale/database) but a URL is simpler.

Would I just replace mysql:// with https:// ?

Something like mysql://user:pw@host works sufficient. I too, also prefer using one variable here. We just support both forms. It doesn't need to be https, we just ignore the protocol part of the url so it more interops with other stuff. https is an implementation detail anyways that isn't necessarily relevant.

I'm now seeing something a bit weird, using a DATABASE_URL env var approach I constantly get unauthorised errors when using node-fetch. However, using a different HTTP client such as undici, it works correctly but takes a long time, as per the original issue.

My code looks like this

import { PlanetScaleDialect } from 'kysely-planetscale';
import { Kysely } from 'kysely';
// import fetch from 'node-fetch';
import { fetch } from 'undici';

import { log } from '@/utils/Log';
import { ResourceInventoryTableItem } from '@/transit/db/resourceInventoryTable/resourceInventoryMySql';
import { ActivityTableItem } from '@/transit/db/activityTable/ActivityTableMySql';
import {
  PartsCatalogItem,
  PartsCatalogPackagedVersionTable,
  PartsCatalogVersionItemTable,
} from '@/transit/db/partsCatalogTable/partsCatalogMySql';


interface Database {
  resource_inventory: ResourceInventoryTableItem;
  activities: ActivityTableItem;
  parts_catalog: PartsCatalogItem;
  parts_catalog_version: PartsCatalogVersionItemTable;
  parts_catalog_package_version: PartsCatalogPackagedVersionTable;
}

let db: Kysely<Database>;

export function createKysely() {
  if (!db) {
    db = new Kysely<Database>({
      dialect: new PlanetScaleDialect({
        url: process.env.DATABASE_URL,
        fetch,
      }),
      log(event): void {
        if (event.level === 'query') {
          log.debug(event.query.sql);
          log.debug(JSON.stringify(event.query.parameters));
        } else if (event.level === 'error') {
          log.error(JSON.stringify(event.error));
        }
      },
    });
  }
  return db;
}

db = createKysely();

I guess one difference is that I'm having to go through they kysely-planetscale adapter but this is a very thin wrapper and looking at the code in that repo, I can't see why node-fetch would make a difference. I'm stumped but I need to sleep now! I'll take another look tomorrow, if you have any thoughts in the meantime, they'd be gratefully received!

@rcoundon I'm genuinely not too sure if there's more I can assist with helping here. There's a decent amount of indirection here and I'm inclined to think it's something misbehaving in your code or in the kysely-planetscale library that I've not looked at.

I find it hard to believe this is a Lambda issue in itself, and I'd probably suggest distilling this down to attempting to just run a select 1 yourself with our library directly to see if that's any issue.

We've definitely put together demos and whatnot in Lambda and had no issues, but we haven't used kysely-planetscale or any other abstractions with database-js, so I'm unsure where things can be falling apart.

On top of that, I am not the most proficient at JavaScript and TypeScript. I worked on and implemented the server side APIs and all of that. So I know how the APIs work and all that, but I'm a bit of a noob when it comes to debugging JS runtimes.

I totally understand, and thank you for your time and help thus far. I'm still digging, I'll let you know what I find!

tbarn commented

Hey @rcoundon, I'm going to close this issue for now, but if you have more to share in the future we are happy to take a look again.