pacman82/odbc2parquet

Getting the SQL statement from stdin

leo-schick opened this issue · 4 comments

I would like to have an option to pass the SQL statement to the odbc2parquet query command via stdin, either through an extra option (e.g. --query-from-stdin) or by passing - as the SQL query statement.

Proposed shell parameter design

Option 1 by passing -

cat my_sql_query.sql \
  | odbc2parquet query -c "<connection_string>" my-output.parquet -

Option 2 by using a new parameter --query-from-stdin

cat my_sql_query.sql \
  | odbc2parquet query -c "<connection_string>" --query-from-stdin my-output.parquet

Hello @leo-schick,

could you help me understand your motivation for doing so? Would reading the query from a file also be an acceptable solution to your problem?

Cheers, Markus

I plan to extend the Mara ETL framework with a function that writes the result of a DB query to Parquet, and I would like to use odbc2parquet for this.

I’ll do this by extending the mara_db.shell.copy_to_stdout_command function. See also mara/mara-db#56

Mara is a nice ETL framework which does the data processing via shell tools instead of processing the data in Python itself. This gives an extra performance boost compared to other ETL engines that rely on e.g. pandas or pyarrow.

> I’ll do this by extending the mara_db.shell.copy_to_stdout_command function. See also mara/mara-db#56

Just looking at this you may also like odbcsv.

Hello @leo-schick,

odbc2parquet 0.9.4 accepts a query from standard input if you pass - as the query string.

Cheers, Markus
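
For reference, here is a minimal sketch of how the released behaviour could be wrapped in a shell helper. It uses only the invocation shown earlier in this thread (query, -c, the output path, and - as the query string). The function name export_query and the variable ODBC2PARQUET_BIN are hypothetical names made up for this example and are not part of odbc2parquet.

```shell
#!/usr/bin/env bash
# Sketch only: export_query and ODBC2PARQUET_BIN are hypothetical names.
# ODBC2PARQUET_BIN lets you point at a specific binary; defaults to the one on PATH.
ODBC2PARQUET_BIN="${ODBC2PARQUET_BIN:-odbc2parquet}"

export_query() {
  local connection_string="$1" sql_file="$2" output_file="$3"

  # Fail early if the query file is missing instead of piping nothing to stdin.
  if [ ! -r "$sql_file" ]; then
    echo "cannot read query file: $sql_file" >&2
    return 1
  fi

  # The trailing "-" asks odbc2parquet (0.9.4 or later, per this thread)
  # to read the query text from standard input.
  "$ODBC2PARQUET_BIN" query -c "$connection_string" "$output_file" - < "$sql_file"
}
```

A call would then look like export_query "<connection_string>" my_sql_query.sql my-output.parquet. Redirecting the file with < avoids the extra cat process, but piping from another tool works the same way.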