IBM/spark-tpc-ds-performance-test

README cleanup

rhagarty opened this issue · 2 comments

@dilipbiswal - good start. Are there any instructions for doing this benchmark by running a notebook in DSX?

Also,

  • can you provide some links in the README when you first reference Apache Spark, Spark SQL, and TPC-DS?
  • can you elaborate some on the TPC-DS benchmark? Would data scientists all know exactly what this is? If not, maybe add some more discussion about why this is important, why they should do this, and what the 99 supported queries are all about.

@rhagarty I am starting to include instructions for the notebook. I have added the links for Spark and TPC-DS. Given that we have added a link for TPC-DS, do we need to add more of a write-up about this benchmark? I have added just a few lines of high-level description. Let me know.

@dilipbiswal - I like the added description. And the links you added are sufficient.

Thanks for the update.