TileDB-Inc/TileDB-Py

DataFrame `from_csv` doesn't support alternative s3-compatible backend

p4perf4ce opened this issue · 0 comments

Whenever we call `tiledb.from_csv` against an S3-compatible backend (an alternate S3 endpoint such as MinIO), the custom configuration that works for regular array operations is not applied.

if mode != "append" and tiledb.array_exists(uri):
    raise tiledb.TileDBError("Array URI '{}' already exists!".format(uri))

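In the snippet above, `tiledb.array_exists(uri)` is not called under `scope_ctx(ctx)`, which causes every `s3://` URI to be sent to Amazon S3 instead of the user-configured custom backend. The `'server' = 'AmazonS3'` header in the traceback below shows this: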

  File ".venv/lib/python3.11/site-packages/tiledb/dataframe_.py", line 868, in from_csv
    if mode != "append" and tiledb.array_exists(uri):
                            ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/tiledb/highlevel.py", line 149, in array_exists
    if tiledb.object_type(uri) != "array":
       ^^^^^^^^^^^^^^^^^^^^^^^
  File "tiledb/libtiledb.pyx", line 3506, in tiledb.libtiledb.object_type
  File "tiledb/libtiledb.pyx", line 356, in tiledb.libtiledb.check_error
  File "tiledb/libtiledb.pyx", line 350, in tiledb.libtiledb._raise_ctx_err
  File "tiledb/libtiledb.pyx", line 335, in tiledb.libtiledb._raise_tiledb_error
tiledb.cc.TileDBError: [TileDB::S3] Error: Error while listing with prefix 's3://localhost:9999/test-bucket/sparse-sample/__schema/' and delimiter '/'[Error Type: 15] [HTTP Response Code: 403] [Exception: AccessDenied] [Remote IP: 54.231.225.66] [Request ID: P7SBVXAXY300G9P9] [Headers: 'content-type' = 'application/xml' 'date' = 'Wed, 15 Nov 2023 06:11:47 GMT' 'server' = 'AmazonS3' 'transfer-encoding' = 'chunked' 'x-amz-bucket-region' = 'us-east-1' 'x-amz-id-2' = 'jSRRHRQVDXOvnZ9aMxXBeL4vq5yp9wm3VErfm8xyS9iHkozkqKeFgQzK5t1JPfxh/YSBkXfYdLo=' 'x-amz-request-id' = 'P7SBVXAXY300G9P9'] : Access Denied
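
A possible workaround until the check runs under the user's context is to make the custom context the default for the whole call. The sketch below is only an illustration: the helper names `s3_compatible_settings` and `from_csv_with_ctx`, the credentials, and the `http` scheme are assumptions, and I have not verified it against every TileDB-Py release. It relies on `tiledb.scope_ctx` replacing the default context, which is what `tiledb.array_exists(uri)` falls back to when no `ctx` is passed.

```python
def s3_compatible_settings(endpoint, access_key, secret_key):
    """Build TileDB VFS settings for an S3-compatible server (hypothetical helper)."""
    return {
        "vfs.s3.endpoint_override": endpoint,      # e.g. "localhost:9999"
        "vfs.s3.scheme": "http",                   # assumption: local server without TLS
        "vfs.s3.use_virtual_addressing": "false",  # path-style URLs, as MinIO expects
        "vfs.s3.aws_access_key_id": access_key,
        "vfs.s3.aws_secret_access_key": secret_key,
    }


def from_csv_with_ctx(uri, csv_path, settings, **kwargs):
    """Run tiledb.from_csv with `settings` applied to the default context too."""
    import tiledb  # imported lazily so the settings helper works without tiledb

    ctx = tiledb.Ctx(tiledb.Config(settings))
    # scope_ctx swaps the default context for the duration of the block, so
    # calls that do not receive ctx explicitly still see the custom endpoint.
    with tiledb.scope_ctx(ctx):
        tiledb.from_csv(uri, csv_path, ctx=ctx, **kwargs)


# Usage (placeholder URI, file name, and credentials):
# settings = s3_compatible_settings("localhost:9999", "minio-key", "minio-secret")
# from_csv_with_ctx("s3://test-bucket/sparse-sample", "sample.csv", settings)
```

If this works, it would also suggest that wrapping the existence check inside `from_csv` in `scope_ctx(ctx)`, or passing `ctx` through to it, is enough to fix the issue.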

Environment

Python 3.11.5 (main, Aug 24 2023, 15:09:45) [Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tiledb
>>> tiledb.version()
(0, 23, 4)

OS: Debian-bookworm aarch64