openvstorage/volumedriver

Differentiate between real runtime errors and client timeout errors

JeffreyDevloo opened this issue · 6 comments

Problem description

Certain volumedriver calls have a set timeout within the framework, but the framework is unable to differentiate between a real problem and a timeout because both raise a RuntimeError with an XMLRPC message.

I'd suggest raising a dedicated timeout exception that the framework can import and check for.
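A minimal sketch of what such a check could look like on the framework side. `ClientTimeoutError` and `hypothetical_volumedriver_call` are made-up names for illustration only and are not part of the volumedriver API. Deriving the exception from RuntimeError is an assumption, chosen so that existing `except RuntimeError` handlers would keep working.

```python
class ClientTimeoutError(RuntimeError):
    """Hypothetical exception type; not provided by volumedriver."""


def hypothetical_volumedriver_call(timed_out):
    # Stand-in for a real client call; it only simulates the two failure modes.
    if timed_out:
        raise ClientTimeoutError("call timed out")
    raise RuntimeError("some real problem")


def classify(timed_out):
    try:
        hypothetical_volumedriver_call(timed_out)
    except ClientTimeoutError:
        # Must be caught before RuntimeError, since it is a subclass of it.
        return "timeout"
    except RuntimeError:
        return "error"
```

Because the subclass is caught first, the framework could treat timeouts separately without changing how other RuntimeErrors are handled.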

Differentiation is already possible (but code inspection suggests a bug): if the node does not respond (= not running, or on timeout), a ClusterNotReachableException or a NodeNotReachableException is raised.
The latter is, however, not correctly translated from C++ to Python (to be fixed as part of this ticket? Please let me know if you want this), which means it's currently reported as a RuntimeError "failed to send XMLRPC request".
At the moment there is no distinction possible between timeout and e.g. voldrv not running (the library used does not offer this).

I assume fixing the bug will allow Framework to know it is a timeout in openvstorage/framework#1557 and act accordingly?

Yes. For now / backward compatibility it might be a good idea to identify connection / timeout issues reported via the "failed to send XMLRPC request" message of a RuntimeError.
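As a rough sketch of that backward-compatible check: the message text is taken from this thread, while the helper name `is_connection_or_timeout_error` and the substring match are assumptions, not existing framework code.

```python
# Message reported by the RuntimeError, as quoted in this issue.
XMLRPC_SEND_FAILURE = "failed to send XMLRPC request"


def is_connection_or_timeout_error(exc):
    """Return True if exc looks like a connection / timeout failure.

    Hypothetical helper: it only inspects the exception message, so it cannot
    tell a timeout apart from a node that is not running.
    """
    return isinstance(exc, RuntimeError) and XMLRPC_SEND_FAILURE in str(exc)


def handle(exc):
    # Example of how a caller might branch on the result.
    if is_connection_or_timeout_error(exc):
        return "unreachable-or-timeout"
    return "other-error"
```

Matching on the message is fragile if the wording ever changes, so this would only be a stopgap until the exception translation is fixed.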

So all "failed to send" errors are timeouts? We could work with that, but a cleaner option (like a custom exception) would be better.

Timeouts or a problem contacting the node (differentiation of these requires changes to the library used / using another library). Mind, this is just to work around the bug in the C++ -> Python exception translation / to get the functionality with old volumedriver instances; fixing the exception translation should still be done of course.

Milestone closed.