Tool for running transformations on columns in a SQLite database.
⚠️ This tool is no longer maintainedI added a new tool to sqlite-utils called sqlite-utils convert which provides a super-set of the functionality originally provided here.
sqlite-transform
is no longer maintained, and I recommend switching to usingsqlite-utils convert
instead.
pip install sqlite-transform
These subcommands will run all values in the specified column through dateutils.parser.parse()
and replace them with the result, formatted as an ISO timestamp or ISO date.
For example, if a row in the database has an opened
column which contains 10/10/2019 08:10:00 PM
, running the following command:
sqlite-transform parsedatetime my.db mytable opened
Will result in that value being replaced by 2019-10-10T20:10:00
.
Using the parsedate
subcommand here would result in 2019-10-10
instead.
In the case of ambiguous dates such as 03/04/05
these commands both default to assuming American-style mm/dd/yy
format. You can pass --dayfirst
to specify that the day should be assumed to be first, or --yearfirst
for the year.
The jsonsplit
subcommand takes columns that contain a comma-separated list, for example a tags
column containing records like "trees,park,dogs"
and converts it into a JSON array ["trees", "park", "dogs"]
.
This is useful for taking advantage of Datasette's Facet by JSON array feature.
sqlite-transform jsonsplit my.db mytable tags
It defaults to splitting on commas, but you can specify a different delimiter character using the --delimiter
option, for example:
sqlite-transform jsonsplit \
my.db mytable tags --delimiter ';'
Values within the array will be treated as strings, so a column containing 123,552,775
will be converted into the JSON array ["123", "552", "775"]
.
You can specify a different type for these values using --type int
or --type float
, for example:
sqlite-transform jsonsplit \
my.db mytable tags --type int
This will result in that column being converted into [123, 552, 775]
.
The lambda
subcommand lets you specify Python code which will be executed against the column.
Here's how to convert a column to uppercase:
sqlite-transform lambda my.db mytable mycolumn --code='str(value).upper()'
The code you provide will be compiled into a function that takes value
as a single argument. You can break your function body into multiple lines, provided the last line is a return
statement:
sqlite-transform lambda my.db mytable mycolumn --code='value = str(value)
return value.upper()'
You can also specify Python modules that should be imported and made available to your code using one or more --import
options:
sqlite-transform lambda my.db mytable mycolumn \
--code='"\n".join(textwrap.wrap(value, 10))' \
--import=textwrap
The --dry-run
option will output a preview of the transformation against the first ten rows, without modifying the database.
Each of these commands accepts optional --output
and --output-type
options. These can be used to save the result of the transformation to a separate column, which will be created if the column does not already exist.
To save the result of jsonsplit
to a new column called json_tags
, use the following:
sqlite-transform jsonsplit my.db mytable tags \
--output json_tags
The type of the created column defaults to text
, but a different column type can be specified using --output-type
. This example will create a new floating point column called float_id
with a copy of each item's ID increased by 0.5:
sqlite-transform lambda my.db mytable id \
--code 'float(value) + 0.5' \
--output float_id \
--output-type float
You can drop the original column at the end of the operation by adding --drop
.
Sometimes you may wish to convert a single column into multiple derived columns. For example, you may have a location
column containing latitude,longitude
values which you wish to split out into separate latitude
and longitude
columns.
You can achieve this using the --multi
option to sqlite-transform lambda
. This option expects your --code
function to return a Python dictionary: new columns well be created and populated for each of the keys in that dictionary.
For the latitude,longitude
example you would use the following:
sqlite-transform lambda demo.db places location \
--code 'return {
"latitude": float(value.split(",")[0]),
"longitude": float(value.split(",")[1]),
}' --multi
The type of the returned values will be taken into account when creating the new columns. In this example, the resulting database schema will look like this:
CREATE TABLE [places] (
[location] TEXT,
[latitude] FLOAT,
[longitude] FLOAT
);
The code function can also return None
, in which case its output will be ignored.
You can drop the original column at the end of the operation by adding --drop
.
By default each command will show a progress bar. Pass -s
or --silent
to hide that progress bar.