recap-build/recap

Support SQLite converter into Recap

Closed this issue ยท 14 comments

SQLite schematab docs:

Every SQLite database contains a single "schema table" that stores the schema for that database. The schema for a database is a description of all of the other tables, indexes, triggers, and views that are contained within the database. The schema table looks like this:

CREATE TABLE sqlite_schema(
  type text,
  name text,
  tbl_name text,
  rootpage integer,
  sql text
);

The sql field is the SQL DDL that would create the table.

To implement this, one would create a recap SQLite client to grab that DDL and then create a recap SQLite converter to turn it into recap types.

Oh, gosh, I can't believe I didn't think of adding this! ๐Ÿคฃ Seems so obvious now that you opened the issue. ๐Ÿ˜

Using the DDL to represent the schema is an interesting (lazy?) choice for SQLite. But thinking about it a bit more... if we implement that for SQLite, we might be able to reuse that code for pulling schema from other databases assuming the DDL dumps are compatible. Potentially replace postgresql and mysql client/converter code using that.

Perhaps sqlglot can be of use?

assuming the DDL dumps are compatible

I think this is the key part of your statement. I suspect information_schema will be more robust. Perhaps start with SQLite impl only, and see where that leads...

Poked around a bit. Looks like there are a few options:

https://www.sqlitetutorial.net/sqlite-describe-table/

The pragma option looks pretty parsable.

@mjperrone do y'all want to take this or should I? I don't mind doing it, but I don't want to start if y'all want to take it for learning experience or whatever. =-)

We aren't in a rush to implement this one, so I dont think we will get to it before you do if you're interested in it

Nice! I've been hankering to write a bit of code, so I'll take a crack at this.

Starting to poke at this now.

@mjperrone been doing some digging on SQLite data types.

  1. SQLite supports two modes: STRICT or default. Tables created without STRICT use type affinities, which are recommendations that SQLite ignores.
  2. sqlite2 and sqlite3 differ
  3. In non-STRICT tables, everything in () is ignored (e.g. INTEGER(123) is just INTEGER to SQLite). The parenthesis are still stored in the pragma description, so it's accessible.
  4. For non-STRICT tables, SQLite uses some kind of wacky affinity matching rules.

Here's what I recommend:

  1. Make the converter work with STRICT tables only.
  2. Make the converter work with sqlite3.

This is a very straight-forward to implement. Is this OK with you?

Alternatively, we could add support for both STRICT and non-STRICT tables. This gets a lot messier. Dates, for example, are a mess in SQLite; it accepts both strings or ints in the column. ๐Ÿ˜ข

WDYT? What is your use case here? Do you have any practical examples of SQLite schemas you're dealing with?

/cc @adrianisk

Some more details on affinity matching:

https://medium.com/@SullivanArielle/type-affinity-5936cee17c35

Note that there are two things happening here:

  1. The type of the column
  2. Coercing values

We're only interested in column types (1). As such, after further reading, I think the converter can handle both affinity and STRICT types. I'll include a flag similar to the PG converter's enforce_array_dimensions flag to specify whether the converter should return STRICT RecapTypes (e.g. all integers get treated as IntType(bits=64)) or affinity types (e.g. a TINYINT would return a BoolType (or maybe IntType(bits=1, variable=False)).

I also plan on implementing scale/precision support and char octet length. They'll only be used when use_affinity_rules==True.

@criccomini I'll try to get some more details for you.

@mjperrone I have a PR up here:

#424

It's still WIP for tests, but I want feedback. I'm going to add CLI tests for recap schema and recap ls and do some minor tweaks.

I may be able to give this a look on Monday