yahoo/streaming-benchmarks

Seen.txt and updated.txt are empty

Nachiket90 opened this issue · 12 comments

I am trying to set up the Yahoo streaming benchmark for one of my assignments. I was able to run the benchmark suite and see results on the console. I was expecting results in seen.txt and updated.txt in the data dir, as mentioned in the project README, but in my case those files are always empty. I might have made a mistake in the setup. Could you help me resolve this and get the results into updated.txt and seen.txt?

I even tried to run https://github.com/dataArtisans/yahoo-streaming-benchmark, but there as well the files were empty after execution.

This seems to indicate that the data did not show up in Redis as expected, or the tool could not find Redis to get the data from. How are you trying to run the benchmark?
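To tell those two cases apart, something like the following could be run on the benchmark host after a test finishes. This is only a rough sketch: the helper names `report_result_files` and `check_redis` are made up for illustration, and the `data` directory and `localhost:6379` defaults are assumptions that should be adjusted to match the local setup. It relies only on `redis-cli ping` and `redis-cli dbsize`.

```shell
#!/bin/sh
# Sketch only: the helper names, the default data directory, and the
# Redis host/port are assumptions, not part of the benchmark scripts.

# Print the state of each result file; return 1 if any is missing or empty.
report_result_files() {
  dir="${1:-data}"
  rc=0
  for f in seen.txt updated.txt; do
    if [ -s "$dir/$f" ]; then
      echo "$f: has data"
    else
      echo "$f: empty or missing"
      rc=1
    fi
  done
  return "$rc"
}

# Check that Redis answers PING, then print how many keys it holds.
# Any reply other than PONG (including an auth error) counts as unreachable.
check_redis() {
  host="${1:-localhost}"
  port="${2:-6379}"
  if ! command -v redis-cli >/dev/null 2>&1; then
    echo "redis-cli not found on PATH"
    return 2
  fi
  if [ "$(redis-cli -h "$host" -p "$port" ping 2>/dev/null)" != "PONG" ]; then
    echo "Redis not reachable at $host:$port"
    return 1
  fi
  echo "Redis key count: $(redis-cli -h "$host" -p "$port" dbsize)"
}
```

Calling `report_result_files data` and then `check_redis` gives a first hint: if Redis is reachable but holds no keys, the streaming job probably never wrote its results, while an unreachable Redis points at the connection settings instead.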

I downloaded the zip version of the benchmark from GitHub and copied it onto a CentOS server. I am trying to run the benchmark as:
./stream-bench.sh SPARK_TEST

That is odd. I'll try to reproduce it and see what I can come up with.

Could you reproduce it?

I was able to reproduce it, but only after making a bunch of changes to the script to have it download the correct things (Spark and Flink both removed packages). It also seems to be happening only for Spark. From what I can tell, Spark is not writing anything into Redis at all, so the empty files are actually accurate. I will have to do some more digging to see what might be happening.

OK, I saw the issue with Flink too, but Storm seems OK. This is really odd, but because we had to get newer versions of both Spark and Flink to get a release that is available for download, there might be something there. More likely it is something with Scala 2.11, which I also had to upgrade, but I will try to look at them.

Thanks for the updates. Please suggest/share if you have a solution for this issue.

Could you identify the root cause of the issue and a solution?
My team is planning to run benchmarks against Spark, so I need a solution for this issue.

I have not been able to identify it yet, but I honestly have not tried that hard and have a lot of other priorities right now. I hopefully will find some time to dig in tomorrow.

Are there any ideas? I've got the same issue.

I have the same issue when running Flink tests only!

For Flink, it looks like the issue happens because of the requested operator parallelism. With the parallelism set to the default (1), it works.