Performance of pySpark-tiny kernel
Closed this issue · 1 comment
anhnongdan commented
- The driver fails to load the entire tc_call_histories file for a single day (need to use a broadcast join)
- The executors fail to load 10 days of tc_call_histories at once -> try loading in a loop
anhnongdan commented
Reading succeeds with a loop,
though the execution time is pretty long.
=> To guard against interruptions, write the intermediate files to disk.
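A rough sketch of what the loop plus intermediate writes could look like, so that an interrupted run can resume instead of starting over. Everything named here is an assumption for illustration, not what the kernel actually runs: `process_days`, the `_DONE` marker file, the `day=YYYY-MM-DD` output layout, one day per iteration, and output on a local filesystem. The Spark part is only indicated in a comment, since the real input format and paths are not stated in this issue.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import Callable, List


def process_days(
    start: date,
    n_days: int,
    out_dir: Path,
    process_day: Callable[[date, Path], None],
) -> List[str]:
    """Process one day at a time and persist each result before moving on.

    Days whose output directory already holds a `_DONE` marker are skipped,
    so re-running after an interruption only redoes the unfinished days.
    Returns the ISO dates that were actually processed in this run.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    processed: List[str] = []
    for offset in range(n_days):
        day = start + timedelta(days=offset)
        target = out_dir / f"day={day.isoformat()}"
        marker = target / "_DONE"
        if marker.exists():
            continue  # finished in an earlier run
        # Hypothetical Spark body for process_day (format and paths assumed):
        #   df = spark.read.parquet(f"<input_dir>/{day.isoformat()}")
        #   df.write.mode("overwrite").parquet(str(target))
        process_day(day, target)
        # Create the marker only after the day's work returned without error.
        target.mkdir(parents=True, exist_ok=True)
        marker.touch()
        processed.append(day.isoformat())
    return processed
```

Calling `process_days(date(2020, 1, 1), 10, Path("intermediate"), my_day_job)` twice would do the work once; the second call returns an empty list unless some days were left unfinished. The dates and the `my_day_job` callable are placeholders.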