djc/bb8

Handling invalid connections

marcbowes opened this issue · 10 comments

If a user incorrectly specifies their connection setting (e.g. typos the IP address), then bb8's get method will hit errors and retry until the timeout. Is there a way of marking certain errors as "just give up"?

djc commented

There is currently no such way that I'm aware. The question is if this is really what you want -- it seems likely that many of these kinds of errors could be transient, so retrying them before timing out might actually be desirable.

Yeah, I agree with you in general. In my case, we have a shell utility connecting to a db. If the user typos their db name then it takes 30s (the timeout) to return an error. I know logging with the error sink will help, but really, retrying is pointless.

djc commented

Maybe you should just set the timeout lower in your application?

Well, that’s not the only problem. The other is that the error message just says timeout.

Do you think bb8 should not have the concept of an exception that should not be retried on (is terminal for the get attempt and is exposed to the user)?

djc commented

That's a pretty abstract question, to which I don't have a concrete answer. I'm skeptical that there's a straightforward way to implement this that doesn't make things worse for other users and doesn't incur a whole bunch of complexity for what in my opinion is limited benefit.

bb8 is chiefly intended for long-running (server) processes. Optimizing for startup doesn't necessarily make sense. You might just start a connection without bb8 to start with to validate the user's input.

The simplest thing I could think of would be to add

fn should_retry(err: &Self::Error) -> bool

to ManageCollection with a default impl that returns true. That's backwards compatible and doesn't "incur a whole bunch of complexity" in my opinion.

The get API already supports exposing the manager defined Error type:

pub async fn get(&self) -> Result<PooledConnection<'_, M>, RunError<M::Error>> 

I'm probably missing something, though.

djc commented

That seems fine to me, let's have a PR (please add some tests).

I ran into an issue where ManagedConnection::Error is not Clone. The problem arises because both get and ErrorSink want the error by value, so there needs to be a copy made. Currently, only the sink gets it because get always returns TimedOut and never UserError(M::Error). Not sure how to address this without adding a new bound which would not be backwards compatible.

I opened #104 to discuss the path forward.

In #104 I suggest an alternative where the sink does not get the error in a catastrophe, which seems OK to me.