openwpm/OpenWPM

Continue CommandSequence after Exception in GetCommand

Closed this issue · 5 comments

Currently, OpenWPM stops running subsequent commands if an error occurs in a previous command.
I added a custom command that writes a data entry into a custom table for each website. The problem is that this custom command has to run after GetCommand, which might fail, so gaps appear in the table, making it hard to compare tables from different crawls (with the same set of sites).
Is there any way to force my command to be executed even after an error in previous commands?

Hey, thanks for bringing this interesting use case to my attention.
Currently, I can't think of a way to do this unless you are willing to give up on parallelization, run each CommandSequence (CS) blocking, and then execute the code from the custom command in your own script. (This assumes you don't need access to the things you get in the execute method.)

I think there is a way to implement this behaviour. However, I wonder what the best way is.

  1. Uninterruptible CS - Even if a command fails, the BrowserManager just keeps executing the next one
  2. Shielded commands - Any command can be marked as shielded so that if it fails, the CS keeps getting executed
  3. Modified GetCommand - This approach doesn't seem fruitful, as we can't catch timeout errors this way.
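
To make the difference between options 1 and 2 concrete, here is a minimal, self-contained sketch of the "shielded command" idea. It is a toy model only: `Step`, `run_sequence`, `failing_get`, and `record_row` are hypothetical names made up for illustration and are not part of OpenWPM, and the real BrowserManager would also have to deal with timeouts and browser restarts.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One entry in a toy command sequence (hypothetical, not an OpenWPM class)."""

    name: str
    run: Callable[[], None]
    shielded: bool = False  # if True, a failure here does not abort the sequence


def run_sequence(steps: List[Step]) -> List[str]:
    """Run steps in order and return a log line for each step."""
    log: List[str] = []
    aborted = False
    for step in steps:
        if aborted:
            # An earlier unshielded step failed, so nothing else runs.
            log.append(f"{step.name}: skipped")
            continue
        try:
            step.run()
            log.append(f"{step.name}: ok")
        except Exception as exc:
            log.append(f"{step.name}: failed ({exc})")
            if not step.shielded:
                aborted = True
    return log


def failing_get() -> None:
    # Stand-in for a GetCommand that hits an error.
    raise RuntimeError("page load failed")


def record_row() -> None:
    # Stand-in for the custom command that writes a table row.
    pass


# Current behaviour: the failure aborts the rest of the sequence.
unshielded = run_sequence([Step("get", failing_get), Step("record", record_row)])

# Option 2: the failing step is shielded, so the next step still runs.
shielded = run_sequence(
    [Step("get", failing_get, shielded=True), Step("record", record_row)]
)
```

In this model, option 1 would amount to treating every step as shielded, while option 2 lets the author of the sequence decide per command.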

Thanks for the answer.
For now, I am using the third one since it doesn't require changing the OpenWPM source code.
However, I think the second approach is the more flexible and scalable one.
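
The thread does not show the code used for the third approach. As a rough sketch of the general idea, assuming the command simply catches its own exceptions, it might look like the following. `TolerantCommand` and `broken_page_load` are hypothetical names and not OpenWPM classes.

```python
from typing import Callable, Optional


class TolerantCommand:
    """Wraps an action so that exceptions raised inside it are recorded, not propagated.

    Hypothetical illustration only; not an OpenWPM class.
    """

    def __init__(self, action: Callable[[], None]) -> None:
        self.action = action
        self.error: Optional[str] = None

    def execute(self) -> None:
        try:
            self.action()
        except Exception as exc:
            # Swallow the error so the caller treats the command as completed.
            self.error = repr(exc)


def broken_page_load() -> None:
    # Stand-in for a page visit that raises an error.
    raise ConnectionError("DNS lookup failed")


cmd = TolerantCommand(broken_page_load)
cmd.execute()  # does not raise; the failure is stored in cmd.error
```

A wrapper like this only sees exceptions raised inside `execute`. A timeout enforced from outside the command never reaches the `except` block, which is the limitation noted for approach 3 above.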

If you can wait until this Friday, I'll draft approach 2 and publish it on a branch.

Sure, thanks. I am not in a hurry!

I think I never got around to this. Looking back, I don't think ignoring errors is a common enough use case to justify this feature.