Hu-Fu is the first system for efficient and secure processing of federated spatial queries. It parses a federated spatial query written in SQL, decomposes the query into plaintext and secure operators, and collects the query result securely. We also provide a demo on the OSM dataset. The features of Hu-Fu are summarized below. (For more details, please refer to Hu-Fu Technical Report.pdf.)
- Efficient and Secure Federated Spatial Queries: Hu-Fu uses novel decomposition plans for federated spatial queries (federated kNN, kNN join, range counting, range query, distance join). (In hufu-core)
- An Easy-to-Use SQL Interface: Hu-Fu accepts queries written in SQL. (In hufu-core)
- Support for Multiple Silos with Heterogeneous Databases: Hu-Fu supports data federations with >= 2 silos, and each silo can use a different spatial database, e.g. PostgreSQL (PostGIS), MySQL, SpatiaLite, Simba, GeoMesa and SpatialHadoop. (In hufu-driver-core)
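To make the idea of decomposing a query into plaintext and secure operators more concrete, here is a minimal Python sketch of a federated range count. Each silo runs a plaintext local count, and the per-silo counts are combined with a toy additive-secret-sharing sum so that only the total is revealed. The silo data, the Euclidean distance test, the modulus, and all function names are assumptions made for illustration; Hu-Fu's actual operators and protocols are described in the technical report and may differ.

```python
import math
import secrets

# Assumed modulus for the shares; it only needs to exceed the largest possible total.
MODULUS = 2**61 - 1


def local_range_count(points, center, radius):
    """Plaintext operator run inside one silo: count points within `radius` of `center`."""
    cx, cy = center
    return sum(1 for (x, y) in points if math.hypot(x - cx, y - cy) <= radius)


def split_into_shares(value, n_parties):
    """Split `value` into n additive shares that sum to `value` modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(local_values):
    """Toy secure summation: silo i sends its j-th share to silo j,
    and each silo publishes only the sum of the shares it received."""
    n = len(local_values)
    all_shares = [split_into_shares(v, n) for v in local_values]
    partial_sums = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partial_sums) % MODULUS


# Hypothetical silos, each holding (x, y) points.
silos = [
    [(121.4, 14.6), (125.0, 10.0)],
    [(121.5, 14.5)],
    [(121.9, 14.9), (121.6, 14.4)],
]
counts = [local_range_count(s, (121.5, 14.5), 0.5) for s in silos]
total = secure_sum(counts)
print(total)
```

In this sketch the individual shares look random on their own, so a silo that sees one share learns nothing about another silo's count; only the final sum is meaningful.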
Requirements:
- Ubuntu 16.04+
- Apache Maven 3.6.0+
- Java 8
- PostgreSQL 10
- PostGIS 2.4
- Python 3.6+
Here we take PostgreSQL (PostGIS) as an example. For the installation of other systems, please refer to MySQL, SpatiaLite, Simba, GeoMesa, SpatialHadoop.
- Install PostgreSQL (PostGIS)
    sudo apt-get install postgresql-10 postgis
- Clone the git repository
    git clone https://github.com/BUAA-BDA/Hu-Fu.git
- Compile and package the source code (check the folder ./release for the packaging result)
    cd Hu-Fu/
    ./package.sh
In the example below, we will show how to execute federated spatial queries over a four-silo data federation with PostgreSQL.
- Create user, database and PostGIS extension in PostgreSQL
    sudo su postgres
    psql
    postgres=# CREATE DATABASE osm_db;
    postgres=# CREATE USER hufu WITH PASSWORD 'hufu';
    postgres=# GRANT ALL ON DATABASE osm_db TO hufu;
    postgres=# \c osm_db
    osm_db=# CREATE EXTENSION postgis;
- Import the data, which is sampled from the OSM dataset
    cd hufu-demo-osm/data-importer/postgresql
    python importer.py  # If a package is missing, install it using 'pip'.
- Start up the drivers (make sure the socket port used by hufu-demo-osm/driver/config[x].json is available)
    cd hufu-demo-osm/driver
    ./start_driver.sh 1 2 3 4  # make sure you have compiled and packaged the source code with package.sh
- Start up the command line interface (CLI)
    cd hufu-demo-osm/cli
    ./start_cli.sh  # make sure you have compiled and packaged the source code with package.sh
- Federated Range Query
    Hu-Fu> SELECT id FROM osm_a WHERE DWithin(Point(121.5, 14.5), location, 0.5);
- Federated Range Counting
    Hu-Fu> SELECT COUNT(*) cnt FROM osm_a WHERE DWithin(Point(121.5, 14.5), location, 0.5);
- Federated kNN
    Hu-Fu> SELECT id FROM osm_a WHERE KNN(Point(121.5, 14.5), location, 8);
- Federated Distance Join
    Hu-Fu> SELECT R.id, S.id FROM osm_b R JOIN osm_a S ON DWithin(S.location, R.location, 0.2);
- Federated kNN Join
    Hu-Fu> SELECT R.id, S.id FROM osm_b R JOIN osm_a S ON KNN(S.location, R.location, 8);
- Exit
    Hu-Fu> !q
- Sample output of federated spatial query
Due to limited space, we only show the output of federated Range Counting and federated kNN query.
- Federated Range Counting
- Federated kNN query
- Stop drivers
    cd hufu-demo-osm/driver
    ./stop_driver.sh
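When checking the results of the demo queries on a small dataset, a plaintext reference can be handy. The sketch below shows one straightforward, non-secure way to answer a kNN query over several silos: each silo returns its own k nearest points, and the candidates are merged to keep the global top k. The in-memory silos, the Euclidean distance, and the function names are assumptions for illustration; this is not Hu-Fu's decomposition plan, and it ignores the security requirements that Hu-Fu addresses.

```python
import heapq
import math


def local_knn(points, query, k):
    """Return the k points of one silo closest to `query` as (distance, id) pairs."""
    qx, qy = query
    scored = [(math.hypot(x - qx, y - qy), pid) for pid, (x, y) in points.items()]
    return heapq.nsmallest(k, scored)


def federated_knn_plaintext(silos, query, k):
    """Merge per-silo candidates; the global k nearest are always among them."""
    candidates = []
    for points in silos:
        candidates.extend(local_knn(points, query, k))
    return [pid for _, pid in heapq.nsmallest(k, candidates)]


# Hypothetical silos mapping id -> (x, y).
silos = [
    {1: (121.5, 14.5), 2: (121.0, 14.0)},
    {3: (121.6, 14.5), 4: (130.0, 20.0)},
    {5: (121.5, 14.8)},
]
result = federated_knn_plaintext(silos, (121.5, 14.5), 3)
print(result)
```

Taking k candidates from every silo is sufficient because any point in the global top k must also be in the top k of the silo that holds it.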
If you want to execute queries on your own spatial data, you should modify some configuration files as follows.
- hufu-demo-osm/data-importer/postgresql/schema.json: Modify this file to import other data. You can also create the tables and import the data yourself.
- hufu-demo-osm/driver/config[x].json: Modify these files to change the security level of each table in the drivers. For details of config[x].json, see Configuration of Driver.
- hufu-demo-osm/client/model.json: Modify this file to change the number of silos in the data federation; see Configuration of CLI.
Hu-Fu is easy to deploy across different physical machines: start the driver on each machine with the script hufu-demo-osm/driver/start_driver.sh, then set the IP addresses in hufu-demo-osm/client/model.json. Note that the driver port must be open on each machine.
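Before pointing the client at remote drivers, it can save time to confirm that each driver port is reachable. The following is a minimal sketch using only Python's standard library. The host addresses and port numbers are placeholders, not values taken from the Hu-Fu configuration files, so substitute whatever your config[x].json and model.json actually specify.

```python
import socket


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder endpoints; replace with the driver addresses you configured.
    drivers = [("127.0.0.1", 12345), ("127.0.0.1", 12346)]
    for host, port in drivers:
        status = "reachable" if port_is_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} {status}")
```

A failed check usually points to a driver that is not running, a firewall rule, or a mismatch between the port in the driver configuration and the one listed for the client.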
To simplify the installation process, the underlying databases of all silos in the example are PostgreSQL (PostGIS). Recall that Hu-Fu supports heterogeneous underlying spatial databases. We also provide adapters for five other spatial databases. You can install these spatial databases by referring to their documentation.