delta-io/delta-sharing

Load profile.json exception

cic1988 opened this issue · 1 comments

Hello experts,

I followed the protocol example to build the reference server. The server generates a presigned URL when the table/query endpoint is called.

Assume that my table_url is profile.json#share.schema.table.
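For readers unfamiliar with the format, the table URL is the profile file path, a `#`, and the dot-separated share, schema, and table names. A minimal sketch of how it is put together and taken apart; the helper names `build_table_url` and `split_table_url` are hypothetical and not part of the delta-sharing library:

```python
def build_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Join a profile path and a share.schema.table name into one table URL."""
    return f"{profile_path}#{share}.{schema}.{table}"


def split_table_url(table_url: str) -> tuple:
    """Split a table URL back into (profile_path, share, schema, table).

    Splits on the last '#', so a profile path that itself contains '#'
    is still handled. Assumes the names contain no dots.
    """
    profile_path, sep, qualified_name = table_url.rpartition("#")
    if not sep or not profile_path:
        raise ValueError(f"expected <profile>#<share>.<schema>.<table>, got {table_url!r}")
    parts = qualified_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected exactly share.schema.table, got {qualified_name!r}")
    share, schema, table = parts
    return profile_path, share, schema, table


table_url = build_table_url("profile.json", "share", "schema", "table")
print(table_url)
print(split_table_url(table_url))
```

Note that the `%23` in the error message further down is simply the percent-encoded form of this `#`.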

With df = delta_sharing.load_as_pandas(table_url, limit=3) the data loads fine, but it fails when I use load_as_spark.

This is the code I use:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Delta Share Demo") \
    .config('spark.jars', 'packages/haddop-azure-3.3.6.jar,packages/delta-sharing-spark_2.12-0.6.4.jar') \
    .getOrCreate()

...

import delta_sharing
df = delta_sharing.load_as_spark(table_url)
df.limit(2).select("path").show()

The error shows:

java.lang.RuntimeException: delta-sharing:/profile.json%23share.schema.table/123/25169076 is not a Parquet file. Expected magic number at tail, but found [0, 20, 14, 55]
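For context, the check behind this message can be reproduced outside Spark. A regular (non-encrypted) Parquet file begins and ends with the four ASCII bytes `PAR1`; the error says the last four bytes read were `[0, 20, 14, 55]` instead. The sketch below assumes the object behind the presigned URL has been downloaded to a local file first; the helper names `read_tail` and `looks_like_parquet` are hypothetical and are not part of delta-sharing or Spark:

```python
import os

# Magic bytes at the start and end of a non-encrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def read_tail(path: str, n: int = 4) -> bytes:
    """Return the last n bytes of the file (fewer if the file is shorter)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(size - n, 0))
        return f.read(n)


def looks_like_parquet(path: str) -> bool:
    """Check only the leading and trailing magic bytes, not the footer itself."""
    if os.path.getsize(path) < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as f:
        head = f.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and read_tail(path) == PARQUET_MAGIC
```

If `looks_like_parquet` returns True for a downloaded file but Spark still reports a wrong tail, the bytes Spark reads are probably not the bytes the server intended to serve (for example, a truncated or mis-ranged response). That is only one possible explanation and not a confirmed diagnosis.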

Have you seen the error before?

@cic1988 Sorry, I haven't seen this before.
Is it still happening?
Do you have the full stack trace?