/fastlite

A bit of extra usability for sqlite

Primary LanguageJupyter NotebookApache License 2.0Apache-2.0

fastlite

fastlite provides some little quality-of-life improvements for interactive use of the wonderful sqlite-utils library. It’s likely to be particularly of interest to folks using Jupyter.

Install

pip install fastlite

Overview

from sqlite_utils import Database
from fastlite import *
from fastcore.utils import *

We demonstrate fastlite‘s features here using the ’chinook’ sample database.

url = 'https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite'
path = Path('chinook.sqlite')
if not path.exists(): urlsave(url, path)

db = Database("chinook.sqlite")

Databases have a t property that lists all tables:

dt = db.t
dt
Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track

You can use this to grab a single table…:

artist = dt.Artist
artist
<Table Artist (ArtistId, Name)>

…or multiple tables at once:

dt['Artist','Album','Track','Genre','MediaType']
[<Table Artist (ArtistId, Name)>,
 <Table Album (AlbumId, Title, ArtistId)>,
 <Table Track (TrackId, Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice)>,
 <Table Genre (GenreId, Name)>,
 <Table MediaType (MediaTypeId, Name)>]

It also provides auto-complete in Jupyter, IPython, and nearly any other interactive Python environment:

Column work in a similar way to tables, using the c property:

ac = artist.c
ac
ArtistId, Name

Auto-complete works for columns too:

Columns, tables, and view stringify in a format suitable for including in SQL statements. That means you can use auto-complete in f-strings.

qry = f"select * from {artist} where {ac.Name} like 'AC/%'"
print(qry)
select * from "Artist" where "Artist"."Name" like 'AC/%'

You can view the results of a select query using q:

db.q(qry)
[{'ArtistId': 1, 'Name': 'AC/DC'}]

Views can be accessed through the v property:

album = dt.Album

acca_sql = f"""select {album}.*
from {album} join {artist} using (ArtistId)
where {ac.Name} like 'AC/%'"""

db.create_view("AccaDaccaAlbums", acca_sql, replace=True)
acca_dacca = db.q(f"select * from {db.v.AccaDaccaAlbums}")
acca_dacca
[{'AlbumId': 1,
  'Title': 'For Those About To Rock We Salute You',
  'ArtistId': 1},
 {'AlbumId': 4, 'Title': 'Let There Be Rock', 'ArtistId': 1}]

A dataclass type with the names, types, and defaults of the tables is created using dataclass():

album_dc = album.dataclass()

Let’s try it:

album_obj = album_dc(**acca_dacca[0])
album_obj
Album_cls(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1)

You can get the definition of the dataclass using fastcore’s dataclass_src – everything is treated as nullable, in order to handle auto-generated database values:

src = dataclass_src(album_dc)
hl_md(src, 'python')
@dataclass
class Album_cls:
    AlbumId: int | None = None
    Title: str | None = None
    ArtistId: int | None = None

There’s also a shortcut to select from a table – just call it as a function. If you’ve previously called dataclass(), returned iterms will be constructed using that class by default. There’s lots of params you can check out, such as limit:

album(limit=2)
[Album_cls(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1),
 Album_cls(AlbumId=2, Title='Balls to the Wall', ArtistId=2)]

Pass a truthy value as the first param and you’ll get tuples of primary keys and records:

album(1, limit=2)
[(1,
  Album_cls(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1)),
 (2, Album_cls(AlbumId=2, Title='Balls to the Wall', ArtistId=2))]

get also uses the dataclass by default:

album.get(1)
Album_cls(AlbumId=1, Title='For Those About To Rock We Salute You', ArtistId=1)

KW args

If you import from fastlite.kw then the following methods accept **kwargs, passing them along to the first dict param:

  • create
  • transform
  • transform_sql
  • update
  • insert
  • upsert
  • lookup
from fastlite.kw import *

Without the above import, create would require a dict param, but here we just pass keyword args directly:

cats = dt.cats
cats.create(id=int, name=str, weight=float, pk='id')
hl_md(cats.schema, 'sql')
CREATE TABLE [cats] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [weight] FLOAT
)

…the same applies to insert here:

cats.insert(name='meow', weight=6)
(idx,cat),*_ = cats(1)
idx,cat
(1, {'id': 1, 'name': 'meow', 'weight': 6.0})

Using ** in upsert here doesn’t actually achieve anything, since we can just pass a dict directly – it’s just to show that it works:

cat['name'] = "moo"
cats.upsert(**cat)
cats()
[{'id': 1, 'name': 'moo', 'weight': 6.0}]
cats.drop()
cats
<Table cats (does not exist yet)>

Diagrams

If you have graphviz installed, you can create database diagrams:

diagram(db.tables)

Pass a subset of columns to just diagram those. You can also adjust the size and aspect ratio.

diagram(db.t['Artist','Album','Track','Genre','MediaType'], size=8, ratio=0.4)