zk-SQL
This project provides basic tools and infrastructure for proving the correctness of the SQL queries performed against the database hosted in an untrusted environment.
As a demo, this project implements a platform that connects "data miners" running SQL-prover nodes and users that wish to outsource their data and are willing to pay for such service.
Other use cases of the zk-SQL include running it as an alternative backend for dApps frontends with no additional trust assumptions, on-chain Oracles serving on-demand data to smart contracts in a completely verifiable way, and a "pay-as-you-go" model for trustless private data marketplaces.
How it works
The integral part of this project is a suite of arithmetic circuits that encode main SQL operations using iden3/circom language. The select.circom
represents the primary read operation for querying certain data from a certain table with a certain condition. The write operations include insert.circom
, update.circom
, delete.circom
.
As private inputs, these circuits commonly take a table
represented as nRows
x nCols
matrix, a nCols
size header
. Parameters fields
and setFields
are vectors used in SELECT
and UPDATE
queries and are expected to always be a subset of header
. Whenever query has and WHERE
clause circuit will take whereConditions
- a matrix where each row is a set of AND
-separated expressions and each row itself is separated with OR
operator, and each cell is a (column, value)
tuple represented a binary equality expression (more conditions are coming).
As public parameters, all circuits take tableCommit
which is a hash of the table
along with the header
calculated during the last write operation and is stored in the zkSQL.sol
contract. The argsCommit
parameter is used similarly for committing to SQL query parameters allowing keeping them private too.
Each circuit will verify the integrity of the private data first by hashing it and comparing it with tableCommit
and only then simulate operations on the data as prescribed by the SQL query and the underlying parameters. The read operation will output the results
matrix repressing the selected data. The write operations will instead output the newTableCommit
, which is the hash of the table after it was modified by the query. All output signals are public.
The verifier (either user for select query or smart contract for write operations) would use public inputs and the generated proof to ensure the correctness of the performed query without having to own the full table data.
Repository structure
circuits
: contains previously described circuits along withutil.circom
andhashTable.circom
for the common intermediary templets.contracts
: contains autogenerated*Verifier.sol
contracts for each SQL operation encoded using PLONK proving system.lib
: source of the SQL-prover toolsnode
: node software running SQL-prover engine and public-facing REST interface based on the Ts.ED framework.ui
: frontend application using SSR based on the Next.js framework.
Instructions
Dependencies:
- Circom 2.0
- Node.js, yarn
- make
Environmental variables
Go to each package directories [lib
, node
, ui
] and create .env
file (see example .env.example
)
cd nano .env
Install dependencies:
yarn install
Build circuits and contracts:
make build
Run node (dev)
yarn start-node
Run ui (dev)
yarn start-ui
Limitations (Mainnet)
- As currently compiled circuits support 10 rows, 5 columns, 5
AND
expessions, 2OR
expessions. - Operators
>
,<
,>=
,<=
inWHERE
expression are possible but disabled to reduce latency.