bug: `BLOB` / `Buffer` input parameters result in corrupted data
jkomyno opened this issue · 0 comments
Hey again 👋🏻, Alberto from @prisma here.
While working on our new Driver Adapters feature, we noticed that @planetscale/database
doesn't handle Bytes
fields correctly, when compared to other JavaScript drivers. In particular, the binary values we insert do not always match the values retrieved by @planetscale/database
(see https://github.com/prisma/team-orm/issues/687). Here is a TypeScript reproduction.
Scaffolding
Consider a PlanetScale database with the following table,
CREATE TABLE `binary_test` (
`id` varchar(191) NOT NULL,
`bytes` longblob NOT NULL,
PRIMARY KEY (`id`)
) ENGINE InnoDB,
CHARSET utf8mb4,
COLLATE utf8mb4_unicode_ci;
generated from this Prisma model:
model binary_test {
id String @id
bytes Bytes
}
Let us also assume that the connection string to such PlanetScale database is available in the DATABASE_URL
environment variable.
The problem
Let's say we want to store an input string in binary_test
.bytes
, a BLOB
field.
To do so, we:
- Wrap the string in a Node.js
Buffer
, and applybase64
encoding. - Pass the buffer as a query parameter to
@planetscale/database
. - Query the record we've just created.
- Read the raw
bytes
column value - Place this value into a
Buffer
- Apply the
base64
decoding to theBuffer
Here's an example with the input string FSDF
import { Client } from '@planetscale/database'
const connectionString = `${process.env.DATABASE_URL}`
async function mainPlanetscale() {
const client = new Client({ url: connectionString })
const input = 'FSDF'
console.log('input', input)
const inputAsBuffer = Buffer.from(input, 'base64')
console.log('inputAsBuffer', inputAsBuffer)
await client.execute('DELETE FROM `binary_test`')
await client.execute('INSERT INTO `binary_test` (`id`, `bytes`) VALUES (1, ?)', [inputAsBuffer])
const result = await client.execute('SELECT `id`, `bytes` FROM `binary_test`')
const outputRaw = result.rows[0]['bytes']
console.log('outputRaw', outputRaw)
console.log('typeof outputRaw', typeof outputRaw)
const outputAsBuffer = Buffer.from(outputRaw)
console.log('outputAsBuffer', outputAsBuffer)
const output = outputAsBuffer.toString('base64')
console.log('output', output)
console.log('`input === output`', input === output)
}
mainPlanetscale().catch((e) => {
console.error(e)
process.exit(1)
})
which yields the following to stdout:
input FSDF
inputAsBuffer <Buffer 15 20 c5>
outputRaw �
typeof outputRaw string
outputAsBuffer <Buffer 15 20 c3 af c2 bf c2 bd>
output FSDDr8K/wr0=
`input === output` false
Notice the discrepancy between the first and the second-last line, which doesn't match our expectations:
- FSDF
+ FSDDr8K/wr0=
It is interesting to observe that only the first 2 bytes of the input are preserved in the output.
Let's compare this to the mysql2
driver, run against the same PlanetScale instance:
// install via `npm i -S mysql2`
import mysql from 'mysql2/promise'
const connectionString = `${process.env.DATABASE_URL}`
async function mainMysql2() {
const client = await mysql.createConnection({
uri: connectionString,
ssl: {
verifyIdentity: false,
rejectUnauthorized: false,
}
})
const input = 'FSDF'
console.log('input', input)
const inputAsBuffer = Buffer.from(input, 'base64')
console.log('inputAsBuffer', inputAsBuffer)
await client.execute('DELETE FROM `binary_test`')
await client.execute('INSERT INTO `binary_test` (`id`, `bytes`) VALUES (1, ?)', [inputAsBuffer])
const result = await client.query('SELECT `id`, `bytes` FROM `binary_test`')
const outputRaw = result[0][0]['bytes']
console.log('outputRaw', outputRaw)
console.log('typeof outputRaw', typeof outputRaw)
const outputAsBuffer = Buffer.from(outputRaw)
console.log('outputAsBuffer', outputAsBuffer)
const output = outputAsBuffer.toString('base64')
console.log('output', output)
console.log('`input === output`', input === output)
await client.end()
}
mainMysql2().catch((e) => {
console.error(e)
process.exit(1)
})
which yields the following to stdout:
input FSDF
inputAsBuffer <Buffer 15 20 c5>
outputRaw <Buffer 15 20 c5>
typeof outputRaw object
outputAsBuffer <Buffer 15 20 c5>
output FSDF
`input === output` true
In particular, we see that input
(first line) matches output
(second-last line) when using mysql2
,.
Does this @planetscale/database-js
byte discrepancy problem occur for every byte input? Not really. For instance, let's run mainPlanetScale
with the input string AQID
:
input AQID
inputAsBuffer <Buffer 01 02 03>
outputAsString
typeof outputAsString string
outputAsBuffer <Buffer 01 02 03>
output AQID
`input === output` true