A fast and performant SQLite extension for regular expressions. Based on sqlite-loadable-rs
, and the regex crate.
See Introducing sqlite-regex: The fastest Regular Expression Extension for SQLite (Jan 2023) for more details!
If your company or organization finds this library useful, consider supporting my work!
.load ./regex0
select 'foo' regexp 'f';
Find all occurrences of a pattern in a string
select regex_find(
'[0-9]{3}-[0-9]{3}-[0-9]{4}',
'phone: 111-222-3333'
);
-- '111-222-3333'
select rowid, *
from regex_find_all(
'\b\w{13}\b',
'Retroactively relinquishing remunerations is reprehensible.'
);
/*
┌───────┬───────┬─────┬───────────────┐
│ rowid │ start │ end │ match │
├───────┼───────┼─────┼───────────────┤
│ 0 │ 0 │ 13 │ Retroactively │
│ 1 │ 14 │ 27 │ relinquishing │
│ 2 │ 28 │ 41 │ remunerations │
│ 3 │ 45 │ 58 │ reprehensible │
└───────┴───────┴─────┴───────────────┘
*/
Extract capture group values by index or name
select
regex_capture(captures, 0) as entire_match,
regex_capture(captures, 'title') as title,
regex_capture(captures, 'year') as year
from regex_captures(
regex("'(?P<title>[^']+)'\s+\((?P<year>\d{4})\)"),
"'Citizen Kane' (1941), 'The Wizard of Oz' (1939), 'M' (1931)."
);
/*
┌───────────────────────────┬──────────────────┬──────┐
│ entire_match │ title │ year │
├───────────────────────────┼──────────────────┼──────┤
│ 'Citizen Kane' (1941) │ Citizen Kane │ 1941 │
│ 'The Wizard of Oz' (1939) │ The Wizard of Oz │ 1939 │
│ 'M' (1931) │ M │ 1931 │
└───────────────────────────┴──────────────────┴──────┘
*/
Use RegexSets to match a string on multiple patterns in linear time
select regexset_is_match(
regexset(
"bar",
"foo",
"barfoo"
),
'foobar'
)
Split the string on the given pattern delimiter
select rowid, *
from regex_split('[ \t]+', 'a b c d e');
/*
┌───────┬──────┐
│ rowid │ item │
├───────┼──────┤
│ 0 │ a │
│ 1 │ b │
│ 2 │ c │
│ 3 │ d │
│ 4 │ e │
└───────┴──────┘
*/
Replace occurrences of a pattern with another string
select regex_replace(
'(?P<last>[^,\s]+),\s+(?P<first>\S+)',
'Springsteen, Bruce',
'$first $last'
);
-- 'Bruce Springsteen'
select regex_replace_all('a', 'abc abc', '');
-- 'bc bc'
See docs.md
for a full API reference.
Language | Install | |
---|---|---|
Python | pip install sqlite-regex |
|
Datasette | datasette install datasette-sqlite-regex |
|
Node.js | npm install sqlite-regex |
|
Deno | deno.land/x/sqlite_regex |
|
Ruby | gem install sqlite-regex |
|
Github Release | ||
Rust | cargo add sqlite-regex |
The Releases page contains pre-built binaries for Linux x86_64, MacOS, and Windows.
If you want to use sqlite-regex
as a Runtime-loadable extension, Download the regex0.dylib
(for MacOS), regex0.so
(Linux), or regex0.dll
(Windows) file from a release and load it into your SQLite environment.
Note: The
0
in the filename (regex0.dylib
/regex0.so
/regex0.dll
) denotes the major version ofsqlite-regex
. Currentlysqlite-regex
is pre v1, so expect breaking changes in future versions.
For example, if you are using the SQLite CLI, you can load the library like so:
.load ./regex0
select regex_version();
-- v0.1.0
Or in Python, using the builtin sqlite3 module:
import sqlite3
con = sqlite3.connect(":memory:")
con.enable_load_extension(True)
con.load_extension("./regex0")
print(con.execute("select regex_version()").fetchone())
# ('v0.1.0',)
Or in Node.js using better-sqlite3:
const Database = require("better-sqlite3");
const db = new Database(":memory:");
db.loadExtension("./regex0");
console.log(db.prepare("select regex_version()").get());
// { 'regex_version()': 'v0.1.0' }
Or with Datasette:
datasette data.db --load-extension ./regex0
I (Alex 👋🏼) spent a lot of time and energy on this project and many other open source projects. If your company or organization uses this library (or you're feeling generous), then please consider supporting my work, or share this project with a friend!
- sqlite-xsv, A SQLite extension for working with CSVs
- sqlite-loadable, A framework for writing SQLite extensions in Rust
- sqlite-http, A SQLite extension for making HTTP requests