rom1504/cc2dataset

Implement restarting the spark app every part

rom1504 opened this issue · 3 comments

or maybe every N parts

Spark slows down in long-running apps.

Either find a way to close the context, or just run each part inside its own process.
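A minimal sketch of the "run inside a process" option, assuming each part (or batch of N parts) can be handled by a standalone worker command. `run_parts`, `worker_cmd`, and `parts_per_process` are hypothetical names for illustration, not cc2dataset's API. The idea is that any Spark session lives and dies inside the child process, so state cannot accumulate across parts.

```python
import subprocess
import sys


def run_parts(part_ids, worker_cmd, parts_per_process=1):
    """Run worker_cmd once per batch of parts, each batch in a fresh OS process.

    worker_cmd is a hypothetical command (list of strings) that processes the
    part ids appended to it as arguments. It is expected to create its own
    Spark session and exit when done.
    """
    # Split the part ids into batches of at most parts_per_process.
    batches = [
        part_ids[i:i + parts_per_process]
        for i in range(0, len(part_ids), parts_per_process)
    ]
    for batch in batches:
        args = list(worker_cmd) + [str(p) for p in batch]
        # check=True raises CalledProcessError if the worker exits non-zero,
        # so a failed batch stops the loop instead of being silently skipped.
        subprocess.run(args, check=True)
    return batches


# Example call (process_part.py is a placeholder script name):
# run_parts(list(range(10)), [sys.executable, "process_part.py"], parts_per_process=2)
```

Setting `parts_per_process` above 1 gives the "every N parts" variant, trading a little process startup overhead against how long each Spark session lives.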

Definitely needed.
It starts at 1000 s per part (25k WAT files).
A few hours later, it is at 3000 s per part.

With a per-part session, it should stay at 1000 s per part, hence up to 3x faster than the slowed-down 3000 s rate.

done