/zk-SQL

ZK-based engine for self-sovereign SQL queries

Primary LanguageSolidityMIT LicenseMIT

zk-SQL

This project provides basic tools and infrastructure for proving the correctness of the SQL queries performed against the database hosted in an untrusted environment.

As a demo, this project implements a platform that connects "data miners" running SQL-prover nodes and users that wish to outsource their data and are willing to pay for such service.

Other use cases of the zk-SQL include running it as an alternative backend for dApps frontends with no additional trust assumptions, on-chain Oracles serving on-demand data to smart contracts in a completely verifiable way, and a "pay-as-you-go" model for trustless private data marketplaces.

How it works

The integral part of this project is a suite of arithmetic circuits that encode main SQL operations using iden3/circom language. The select.circom represents the primary read operation for querying certain data from a certain table with a certain condition. The write operations include insert.circom, update.circom, delete.circom.

As private inputs, these circuits commonly take a table represented as nRows x nCols matrix, a nCols size header. Parameters fields and setFields are vectors used in SELECT and UPDATE queries and are expected to always be a subset of header. Whenever query has and WHERE clause circuit will take whereConditions - a matrix where each row is a set of AND-separated expressions and each row itself is separated with OR operator, and each cell is a (column, value) tuple represented a binary equality expression (more conditions are coming).

As public parameters, all circuits take tableCommit which is a hash of the table along with the header calculated during the last write operation and is stored in the zkSQL.sol contract. The argsCommit parameter is used similarly for committing to SQL query parameters allowing keeping them private too.

Each circuit will verify the integrity of the private data first by hashing it and comparing it with tableCommit and only then simulate operations on the data as prescribed by the SQL query and the underlying parameters. The read operation will output the results matrix repressing the selected data. The write operations will instead output the newTableCommit, which is the hash of the table after it was modified by the query. All output signals are public.

The verifier (either user for select query or smart contract for write operations) would use public inputs and the generated proof to ensure the correctness of the performed query without having to own the full table data.

Repository structure

  • circuits: contains previously described circuits along with util.circom and hashTable.circom for the common intermediary templets.
  • contracts: contains autogenerated *Verifier.sol contracts for each SQL operation encoded using PLONK proving system.
  • lib: source of the SQL-prover tools
    • engine: tools for bridging embedded sql.js database with snarkjs for operating data tables while producing ZK proofs.
    • client: client for the API and other tools for client-side proof verification.
  • node: node software running SQL-prover engine and public-facing REST interface based on the Ts.ED framework.
  • ui: frontend application using SSR based on the Next.js framework.

Instructions

Dependencies:

  • Circom 2.0
  • Node.js, yarn
  • make

Environmental variables

Go to each package directories [lib, node, ui] and create .env file (see example .env.example)

cd nano .env

Install dependencies:

yarn install

Build circuits and contracts:

make build

Run node (dev)

yarn start-node

Run ui (dev)

yarn start-ui

Limitations (Mainnet)

  • As currently compiled circuits support 10 rows, 5 columns, 5 AND expessions, 2 OR expessions.
  • Operators >, <, >=, <= in WHERE expression are possible but disabled to reduce latency.