planetscale/database-js

bug: `BLOB` / `Buffer` input parameters result in corrupted data

jkomyno opened this issue · 0 comments

Hey again 👋🏻, Alberto from @prisma here.

While working on our new Driver Adapters feature, we noticed that @planetscale/database doesn't handle Bytes fields correctly, when compared to other JavaScript drivers. In particular, the binary values we insert do not always match the values retrieved by @planetscale/database (see https://github.com/prisma/team-orm/issues/687). Here is a TypeScript reproduction.

Scaffolding

Consider a PlanetScale database with the following table,

CREATE TABLE `binary_test` (
  `id` varchar(191) NOT NULL,
  `bytes` longblob NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE InnoDB,
  CHARSET utf8mb4,
  COLLATE utf8mb4_unicode_ci;

generated from this Prisma model:

model binary_test {
  id String @id
  bytes Bytes
}

Let us also assume that the connection string to such PlanetScale database is available in the DATABASE_URL environment variable.

The problem

Let's say we want to store an input string in binary_test.bytes, a BLOB field.
To do so, we:

  1. Wrap the string in a Node.js Buffer, and apply base64 encoding.
  2. Pass the buffer as a query parameter to @planetscale/database.
  3. Query the record we've just created.
  4. Read the raw bytes column value
  5. Place this value into a Buffer
  6. Apply the base64 decoding to the Buffer

Here's an example with the input string FSDF

import { Client } from '@planetscale/database'

const connectionString = `${process.env.DATABASE_URL}`

async function mainPlanetscale() {
  const client = new Client({ url: connectionString })
  const input = 'FSDF'
  console.log('input', input)

  const inputAsBuffer = Buffer.from(input, 'base64')
  console.log('inputAsBuffer', inputAsBuffer)

  await client.execute('DELETE FROM `binary_test`')
  await client.execute('INSERT INTO `binary_test` (`id`, `bytes`) VALUES (1, ?)', [inputAsBuffer])
  const result = await client.execute('SELECT `id`, `bytes` FROM `binary_test`')

  const outputRaw = result.rows[0]['bytes']
  console.log('outputRaw', outputRaw)
  console.log('typeof outputRaw', typeof outputRaw)

  const outputAsBuffer = Buffer.from(outputRaw)
  console.log('outputAsBuffer', outputAsBuffer)

  const output = outputAsBuffer.toString('base64')
  console.log('output', output)
  console.log('`input === output`', input === output)
}

mainPlanetscale().catch((e) => {
  console.error(e)
  process.exit(1)
})

which yields the following to stdout:

input FSDF
inputAsBuffer <Buffer 15 20 c5>
outputRaw  �
typeof outputRaw string
outputAsBuffer <Buffer 15 20 c3 af c2 bf c2 bd>
output FSDDr8K/wr0=
`input === output` false

Notice the discrepancy between the first and the second-last line, which doesn't match our expectations:

- FSDF
+ FSDDr8K/wr0=

It is interesting to observe that only the first 2 bytes of the input are preserved in the output.


Let's compare this to the mysql2 driver, run against the same PlanetScale instance:

// install via `npm i -S mysql2`
import mysql from 'mysql2/promise'

const connectionString = `${process.env.DATABASE_URL}`

async function mainMysql2() {
  const client = await mysql.createConnection({
    uri: connectionString,
    ssl: {
      verifyIdentity: false,
      rejectUnauthorized: false,
    }
  })

  const input = 'FSDF'
  console.log('input', input)

  const inputAsBuffer = Buffer.from(input, 'base64')
  console.log('inputAsBuffer', inputAsBuffer)

  await client.execute('DELETE FROM `binary_test`')
  await client.execute('INSERT INTO `binary_test` (`id`, `bytes`) VALUES (1, ?)', [inputAsBuffer])
  const result = await client.query('SELECT `id`, `bytes` FROM `binary_test`')

  const outputRaw = result[0][0]['bytes']
  console.log('outputRaw', outputRaw)
  console.log('typeof outputRaw', typeof outputRaw)

  const outputAsBuffer = Buffer.from(outputRaw)
  console.log('outputAsBuffer', outputAsBuffer)

  const output = outputAsBuffer.toString('base64')
  console.log('output', output)
  console.log('`input === output`', input === output)

  await client.end()
}

mainMysql2().catch((e) => {
  console.error(e)
  process.exit(1)
})

which yields the following to stdout:

input FSDF
inputAsBuffer <Buffer 15 20 c5>
outputRaw <Buffer 15 20 c5>
typeof outputRaw object
outputAsBuffer <Buffer 15 20 c5>
output FSDF
`input === output` true

In particular, we see that input (first line) matches output (second-last line) when using mysql2,.


Does this @planetscale/database-js byte discrepancy problem occur for every byte input? Not really. For instance, let's run mainPlanetScale with the input string AQID:

input AQID
inputAsBuffer <Buffer 01 02 03>
outputAsString 
typeof outputAsString string
outputAsBuffer <Buffer 01 02 03>
output AQID
`input === output` true