planetscale/database-js

INVISIBLE columns showing up with SELECT *

cdcarson opened this issue · 6 comments

I noticed the other day that SELECT * queries were coming back with columns I've marked as INVISIBLE included. I'm sure this was not the case the last time I paid attention, a few months ago. That is, INVISIBLE columns were omitted. I can do without it, but INVISIBLE is a convenient way to exclude stuff based on security concerns or optimization.

This might not be the right place to file this issue -- the behavior is the same using a mysql driver -- but this is the library I'm using. Maybe related upstream: vitessio/vitess#9783.

Reproduction:

CREATE TABLE `UserLogin` (
  `userId` bigint unsigned NOT NULL,
  `email` varchar(320) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL,
  `password` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL /*!80023 INVISIBLE */,
  PRIMARY KEY (`userId`),
  UNIQUE KEY `UserLogin_email_key` (`email`),
  KEY `UserLogin_userId_idx` (`userId`)
);

SHOW COLUMNS FROM UserLogin;
+----------+-----------------+------+-----+---------+-----------+
| Field    | Type            | Null | Key | Default | Extra     |
+----------+-----------------+------+-----+---------+-----------+
| userId   | bigint unsigned | NO   | PRI | NULL    |           |
| email    | varchar(320)    | NO   | UNI | NULL    |           |
| password | varchar(191)    | NO   |     | NULL    | INVISIBLE |
+----------+-----------------+------+-----+---------+-----------+

INSERT INTO UserLogin (`userId`, `email`, `password`) VALUES (1, 'foo@bar.com', 'hashed IRL');
SELECT * FROM UserLogin WHERE userId = 1;
+--------+-------------+------------+
| userId | email       | password   |
+--------+-------------+------------+
|      1 | foo@bar.com | hashed IRL |
+--------+-------------+------------+

Interesting, personally, did not know this was a thing.

So yeah, to your point, this does the same behavior over the normal mysql client too.

For example:

mysql> select * from UserLogin;
+--------+-------------+------------+
| userId | email       | password   |
+--------+-------------+------------+
|      1 | foo@bar.com | hashed IRL |
+--------+-------------+------------+
1 row in set (0.10 sec)

Since this has piqued my interest, I'm going to investigate this a bit. Each column is transferred with a bitfield of flags.

So this information isn't lost, it's contained within the bitfield as this bit: https://dev.mysql.com/doc/dev/mysql-server/latest/group__group__cs__column__definition__flags.html#gaccd91feb56c3b9147458bd6f0686eb5a

Now, we don't do anything to say, explicitly strip these fields based on this, so I'm curious who is responsible for dealing with this, since it seems the MySQL CLI also doesn't ignore them. I'm curious vitess is supposed to omit these and it's just errornously giving us all of the information back, or if clients are supposed to choose what to do with this.

Even with Python, it returns all columns:

$ python
>>> import MySQLdb
>>> db = MySQLdb.connect(host='aws.connect.psdb.cloud', port=3306, user='xxx', password='xxx')
>>> c = db.cursor()
>>> c.execute('select * from UserLogin')
1
>>> c.fetchone()
(1, 'foo@bar.com', 'hashed IRL')

So I strongly suspect if there's any behavior wrong, it's on the vitess side.

Ok, and behavior against vanilla mysql 8.0.29, it behaves how you're suggesting.

mysql> SELECT * FROM UserLogin WHERE userId = 1;
+--------+-------------+
| userId | email       |
+--------+-------------+
|      1 | foo@bar.com |
+--------+-------------+
1 row in set (0.01 sec)

@mattrobenolt Yeah, it's definitely upstream from this library. On the off-chance it wasn't, I tried downgrading to @planetscale/database to 1.8 and 1.6, versions where I know that INVISIBLE was working as expected at the time, and same deal -- the columns are included. So it's something with Vitess or PlanetScale's implementation.

As I said, I can work around this. But if it can't be fixed maybe the MySQL compatibility page can be updated.

Thanks!

Vitess currently doesn't support invisible columns for query planning, which is why this doesn't work. That it worked before is more kinda accidental of how the PSDB environment is set up, so we didn't really know this would regress.

vitessio/vitess#14367 tracks the feature request on the Vitess side and vitessio/vitess#14366 implements it. We might end up depending though on completing a rollout to a new Vitess version for this internally then, but we'll see if we can safely backport this or not.

I just noticed this seems to be fixed at least on my PlanetScale database, i.e., INVISIBLE columns are omitted with SELECT *.

Is it fixed or am I jumping the gun? Thanks.