external location in copy with filesystem not works
ZergRocks opened this issue · 5 comments
import duckdb
import fsspec
fs = fsspec.filesystem('jupyter', url='{jupyter lab endpoint}', tok='{jupyter lab token}')
conn = duckdb.connect(":memory:")
conn.register_filesystem(fs)
conn.sql(f"""
copy(
SELECT *
FROM 'jupyter://work/1-12.parquet'
) to 'jupyter://1-13.parquet' (format 'parquet')
""")
The above copy query works well in my Jupiter Lab and processes the parquet in the folder to complete the writing well outside the folder
However, the external.sql
macro in dbt-duckdb
only reads well, and writes are done by creating a folder structure called jupiter:
in my local file system
Maybe the maintenance processing for the fsspec
session was not considered in the phrase that read and write create and insert and write by temp
table, but I couldn't find the exact reason for it at this time.
I was wondering if you could tell me how to solve this part or if you would like to reflect it in PR in the future
also same issue arrised with sftp
filesystem(I tried)
# profiles.yml
dbt_duckdb:
outputs:
dev:
type: duckdb
path: ':memory:'
filesystems:
- fs: jupyter
url: '{jupyter lab endpoint}'
tok: '{jupyter lab endpoint}'
plugins:
- module: delta
target: dev
# dbt_project.yml
name: 'dbt_duckdb'
version: '1.0.0'
config-version: 2
profile: 'dbt_duckdb_delta'
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
clean-targets:
- "target"
- "dbt_packages"
models:
dbt_duckdb_delta:
delta-external:
+materialized: 'external_table'
@ZergRocks happy new year + two questions:
- In the dbt_project.yml the materialization type you have is
external_table
but I think you meantexternal
? I don't have anexternal_table
materialization defined, I assumed that was a typo. - What
location
are you using for the models that you are trying to materialize externally?
(There's a jupyter
(the fsspec impl) vs. jupiter
(the file location on the local file system) potential typo in your description of the problem that made me wonder if the location
field had an issue)
changed from external_table
to external
and set proper location
works..
sorry for making confusion 🙇
I just tried to use this branch #284 wtih jupyter fs and had experienced not work what I expected
now I know I made wrong config to rollback and external & jupyter option works well and have no problem
thanks for checking and sorry again 🙇 @jwills
Hi @ZergRocks, sorry for confusion here, in my branch i made a new materialization "external_table" in order to learn how it has to be done, because i wanted to showcase the delta plugin.
My end goal is to incorporate delta plagin in materialization "external" which is used by all other use cases