Process Marketing Job not writing parquet file to S3
enr1c091 opened this issue · 2 comments
Hi,
I am running this sample and, for a reason I can't figure out, process_marketing_data.py isn't writing the output file to S3, and the Count: log in CloudWatch Logs (CWL) returns 0. As a result, the Join step fails because it can't infer the schema of the Parquet file.
You should upload the sales sample data to aws-etl-orchestrator-demo-raw-data/sales and the marketing sample data to aws-etl-orchestrator-demo-raw-data/marketing.
For example:
aws s3 ls s3://aws-etl-orchestrator-demo-raw-data --region ap-northeast-1 --profile us-east-1 --recursive
2019-12-26 17:39:42 0 marketing/
2019-12-26 17:43:36 151746 marketing/MarketingData_QuickSightSample.csv
2019-12-26 17:42:55 0 sales/
2019-12-26 17:43:51 2002910 sales/SalesPipeline_QuickSightSample.csv
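A quick way to confirm the layout before re-running the jobs is to check the listing for a non-empty file under each prefix. The following is only a minimal sketch: the function name `missing_prefixes` is hypothetical and not part of the sample project, and it assumes the four-column output format of `aws s3 ls --recursive` shown above.

```python
# Hypothetical helper (not part of the sample project): scan the text output
# of `aws s3 ls --recursive` and report prefixes that hold no non-empty object.

def missing_prefixes(listing, prefixes=("marketing/", "sales/")):
    """Return the prefixes that have no non-empty object in the listing."""
    found = set()
    for line in listing.splitlines():
        parts = line.split(None, 3)  # date, time, size, key
        if len(parts) != 4 or not parts[2].isdigit():
            continue  # skip blank or unexpected lines
        size, key = int(parts[2]), parts[3]
        for prefix in prefixes:
            # Ignore the zero-byte "folder" placeholder keys such as "marketing/".
            if key.startswith(prefix) and key != prefix and size > 0:
                found.add(prefix)
    return [p for p in prefixes if p not in found]


listing = """\
2019-12-26 17:39:42 0 marketing/
2019-12-26 17:43:36 151746 marketing/MarketingData_QuickSightSample.csv
2019-12-26 17:42:55 0 sales/
2019-12-26 17:43:51 2002910 sales/SalesPipeline_QuickSightSample.csv
"""
print(missing_prefixes(listing))  # prints []
```

If the function returns a non-empty list, the named prefixes contain only placeholder keys or nothing at all, which would be consistent with the marketing job reading zero rows.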
As @liangruibupt pointed out, the datasets need to be copied to the bucket first. The project README has been updated with instructions for copying the datasets.