Interact with Delta Lake through a RESTful API.
DeltaREST is a Python library that lets you launch a Flask-based server inside your Spark driver process, exposing a RESTful API to interact with Delta tables.
(Help and pull requests are very welcome!)
pip install deltarest
from deltarest import DeltaRESTService
from pyspark.sql import SparkSession
# Create local SparkSession
SparkSession \
    .builder \
    .appName("local_deltarest_test") \
    .master("local") \
    .config("spark.jars.packages", "io.delta:delta-core_2.12:0.8.0") \
    .getOrCreate()
# Start service on port 4444
DeltaRESTService(delta_root_path="/tmp/lakehouse-root") \
    .run("0.0.0.0", "4444")
Notes: When deployed on a cluster:
- delta_root_path could be a cloud storage path.
- Deploy the Spark app using client deployMode.
curl -X PUT http://127.0.0.1:4444/tables/foo
Response code 201.
{
"message":"Table foo created"
}
On an already existing table identifier:
curl -X PUT http://127.0.0.1:4444/tables/foo
Response code 200.
{
"message":"Table foo already exists"
}
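The two PUT outcomes above (201 on creation, 200 when the table already exists) can be handled from Python as well. The following is a minimal sketch using only the standard library, not part of DeltaREST itself. The names BASE_URL, build_create_table_request, describe_create_status and create_table are hypothetical helpers introduced for illustration.

```python
import urllib.request

# Assumption: the service started in the example above, reachable locally.
BASE_URL = "http://127.0.0.1:4444"


def build_create_table_request(table: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates a table."""
    return urllib.request.Request(f"{base_url}/tables/{table}", method="PUT")


def describe_create_status(status: int) -> str:
    """Map the documented response codes to a short outcome label."""
    if status == 201:
        return "created"
    if status == 200:
        return "already exists"
    return f"unexpected status {status}"


def create_table(table: str, base_url: str = BASE_URL) -> str:
    """Send the request. This needs a running DeltaREST service."""
    request = build_create_table_request(table, base_url)
    with urllib.request.urlopen(request) as response:
        return describe_create_status(response.status)
```

Because both documented outcomes are success codes, a caller can treat the PUT as safe to repeat and only inspect the label when it matters whether the table was new.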
curl -X POST http://127.0.0.1:4444/tables/foo --data '{"rows":[{"id":1,"collection":[1,2]},{"id":2,"collection":[3,4]}]}'
Response code 201.
{
"message": "Rows created"
}
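The same row insertion could be issued from Python. This is a sketch under assumptions: BASE_URL and build_insert_rows_request are hypothetical names, and the request deliberately sets no Content-Type header, so urllib falls back to application/x-www-form-urlencoded, the same default curl uses with --data. Whether the service accepts other content types is not covered here.

```python
import json
import urllib.request

# Assumption: the service started in the example above, reachable locally.
BASE_URL = "http://127.0.0.1:4444"


def build_insert_rows_request(table: str, rows: list, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request whose body is {"rows": [...]}."""
    body = json.dumps({"rows": rows}).encode("utf-8")
    # No explicit Content-Type: urllib then sends the same default as `curl --data`.
    return urllib.request.Request(f"{base_url}/tables/{table}", data=body, method="POST")


if __name__ == "__main__":
    request = build_insert_rows_request(
        "foo",
        [{"id": 1, "collection": [1, 2]}, {"id": 2, "collection": [3, 4]}],
    )
    # Sending requires a running DeltaREST service.
    with urllib.request.urlopen(request) as response:
        print(response.status, response.read().decode("utf-8"))
```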
curl -G http://127.0.0.1:4444/tables
Response code 200.
{
"tables":["foo"]
}
curl -G http://127.0.0.1:4444/tables/foo
Response code 200.
{
"rows":[
{"id":1,"collection":[1,2]},
{"id":2,"collection":[3,4]}
]
}
On a nonexistent Delta table:
curl -G http://127.0.0.1:4444/tables/bar
Response code 404.
{
"message":"Table bar not found"
}
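A Python client has to treat the 404 case separately, because urllib raises an exception on 4xx responses instead of returning them. The sketch below shows one way to do that. TableNotFound, parse_table_response and read_table are hypothetical names, and the response shapes are taken only from the examples above.

```python
import json
import urllib.error
import urllib.request

# Assumption: the service started in the example above, reachable locally.
BASE_URL = "http://127.0.0.1:4444"


class TableNotFound(LookupError):
    """Raised when the service answers 404 for a table."""


def parse_table_response(status: int, body: bytes) -> list:
    """Return the rows for a 200 response, raise TableNotFound for a 404."""
    if status not in (200, 404):
        raise RuntimeError(f"unexpected status {status}")
    payload = json.loads(body)
    if status == 404:
        raise TableNotFound(payload.get("message", "table not found"))
    return payload["rows"]


def read_table(table: str, base_url: str = BASE_URL) -> list:
    """Fetch all rows of a table. This needs a running DeltaREST service."""
    try:
        with urllib.request.urlopen(f"{base_url}/tables/{table}") as response:
            status, body = response.status, response.read()
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the error object still carries the body.
        status, body = error.code, error.read()
    return parse_table_response(status, body)
```

Keeping the parsing separate from the network call makes the status handling easy to check without a live service.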
SQL queries must only involve listable Delta tables.
curl -G http://127.0.0.1:4444/tables --data-urlencode "sql=SELECT count(*) as count FROM foo CROSS JOIN foo"
Response code 200.
{
"rows":[
{"count":4}
]
}
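The count of 4 follows from the cross join of the two-row table foo with itself (2 × 2 rows). With curl, -G combined with --data-urlencode moves the encoded sql parameter into the URL query string. A Python equivalent might look like the sketch below, where BASE_URL and build_sql_url are hypothetical names introduced for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the service started in the example above, reachable locally.
BASE_URL = "http://127.0.0.1:4444"


def build_sql_url(sql: str, base_url: str = BASE_URL) -> str:
    """Return the GET URL carrying the query in a URL-encoded `sql` parameter."""
    return f"{base_url}/tables?{urlencode({'sql': sql})}"


if __name__ == "__main__":
    url = build_sql_url("SELECT count(*) as count FROM foo CROSS JOIN foo")
    print(url)
    # Decoding the query string gives back the original SQL text.
    print(parse_qs(urlsplit(url).query)["sql"][0])
```

The resulting URL can be passed to urllib.request.urlopen against a running service, and the response parsed the same way as a plain table read.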