seancarroll/victors

Default Publisher

seancarroll opened this issue · 6 comments

Victors default publisher is the NoopPublisher which as the name suggests is a noop which isnt very useful. Currently callers have to set a publisher for every experiment which isnt a great developer experience. Is there a way to allow them to provide a default publisher that would be use for each experiment?

Initial thought was perhaps adding a publisher field to Victors so that users could create an instance of Victor to pass around and use. To make this work we would need to add methods similar to the corresponding associated functions. Rough concept looked something like the following

pub struct Victor<'a, R: Clone + PartialEq> {
    // TODO: do we need to make `dyn publisher` implement clone trait?
    publisher: Option<Box<dyn Publisher<R> + 'a>>, // TODO: make this an Option
}

/// Helper that will instantiate and experiment and call run for you
impl<'a, R: Clone + PartialEq> Victor<'a, R> {

    pub fn conduct_controlled<F>(&self, name: &'static str, experiment_block: F) -> VictorsResult<R>
        where
            F: Fn(&mut Experiment<'_, R>) -> VictorsResult<()>,
    {
        let mut experiment = Experiment::new(name);
        if let Some(publisher) = &self.publisher {
            experiment.result_publisher(Box::new((**publisher).clone()));
        }
        experiment_block(&mut experiment)?;
        return experiment.run();
    }

    pub fn conduct_unc<F>(
        &self,
        name: &'static str,
        return_candidate_result: &'static str,
        experiment_block: F
    ) -> VictorsResult<R>
        where
            F: Fn(&mut UncontrolledExperiment<'_, R>) -> VictorsResult<()>,
    {
        let mut experiment = UncontrolledExperiment::new(name);
        if let Some(publisher) = &self.publisher {
            experiment.result_publisher(Box::new((**publisher).clone()));
        }
        experiment_block(&mut experiment)?;
        return experiment.run(return_candidate_result);
    }

}

The issue I ran into is how to move publisher from Victor struct to experiment. My understanding is that cloning trait objects is unsafe and I would like to not have to deal with any unsafe logic. There is a crate that does handle this scenario https://github.com/dtolnay/dyn-clone. If possible I would also like to avoid having to add another dependency. This isnt a hard rule but if I can find an alternative that would allow for me not to include unsafe code and/or add another dependency I would prefer it.

Two more things I would like to explore

  1. Maybe similar to the above with a slightly different spin is to see if the factory pattern might work here. General idea is that callers would create a factory that would create experiments with the appropriate publisher set and then configure the experiment as they typically would.
  2. I wonder if I could define a trait with an associated type which callers would implement and provide the specific publisher.

Another option is to do something similar to how OpenTelemetry does their Global Trace API which provides applications access to their configured trace provider instance from anywhere in the codebase. See the following references

  1. https://github.com/open-telemetry/opentelemetry-rust/blob/f20c9b40547ee20b6ec99414bb21abdd3a54d99b/opentelemetry-api/src/global/mod.rs
  2. https://github.com/open-telemetry/opentelemetry-rust/blob/f20c9b40547ee20b6ec99414bb21abdd3a54d99b/opentelemetry-api/src/global/trace.rs

Specifically they allow callers to override the global trace provider via set_trace_provider

I'm not exactly sure of all the implementation details but would certainly work for our use case albeit a simpler factory pattern or trait might be sufficient

Experimenting with the global exporter option and running into an issue because Publisher has a generic parameter.

/// Represents the globally configured [`Publisher`] instance for this application.
#[derive(Clone)]
pub struct GlobalResultPublisher<R: Clone + PartialEq> {
    publisher: Arc<dyn Publisher<R>>,
}
impl<R: Clone + PartialEq> GlobalResultPublisher<R> {
    /// Create a new GlobalResultPublisher instance from a struct that implements `Publisher`.
    fn new<P>(publisher: P) -> Self
        where
            P: Publisher<R> + 'static,
    {
        GlobalResultPublisher {
            publisher: Arc::new(publisher),
        }
    }
}

// this is complaining because of the generic parameter R
static GLOBAL_RESULT_PUBLISHER: once_cell::sync::Lazy<RwLock<GlobalResultPublisher<R>>> = once_cell::sync::Lazy::new(|| {
    RwLock::new(GlobalResultPublisher::new(
        NoopPublisher{},
    ))
});

I wonder if I could define a trait with an associated type which callers would implement and provide the specific publisher.

Not sure how I feel about this but below is an implementation of this option. I extracted out a trait Scientist defined as

pub trait Scientist<'a, R: Clone + PartialEq + Serialize> {
    type PUB: Publisher<R> + 'a;

    fn conduct<F>(name: &'static str, experiment_block: F) -> VictorsResult<R>
    where
        F: Fn(&mut Experiment<'_, R>) -> VictorsResult<()>,
    {
        let mut experiment = Experiment::new(name);
        experiment.result_publisher(Self::get_publisher());
        experiment_block(&mut experiment)?;
        return experiment.run();
    }

    fn conduct_uncontrolled<F>(
        name: &'static str,
        return_candidate_result: &'static str,
        experiment_block: F,
    ) -> VictorsResult<R>
        where
            F: Fn(&mut UncontrolledExperiment<'_, R>) -> VictorsResult<R>,
    {
        let mut experiment = UncontrolledExperiment::new(name);
        experiment.result_publisher(Self::get_publisher());
        experiment_block(&mut experiment)?;
        return experiment.run(return_candidate_result);
    }

    fn get_publisher() -> Self::PUB;

}

which will allow users to define their own scientist with their own pubisher like the following

pub struct PrintPublisher;
impl<R: Clone + PartialEq + Serialize> Publisher<R> for PrintPublisher {
    fn publish(&self, result: &ExperimentResult<R>) {
        println!("{}", serde_json::to_string(result).unwrap());
    }
}

struct Reed;
impl<'a, R: Clone + PartialEq + Serialize> Scientist<'a, R> for Reed {
    type PUB = PrintPublisher;

    fn get_publisher() -> Self::PUB {
        return PrintPublisher{};
    }
}

and then used like

let r = Reed::conduct("conduct test", |experiment| {
    experiment.control(|| 1)?;
    experiment.candidate(|| 2)?;
    Ok(())
});

In order to support a global Publisher looks like I need to remove the generic R parameter so I plan to have the publisher take in a Value enum, similar to serde and opentelemetry. I'll keep the generic parameter as the return type from Victor / Experiment but will convert to the Value enum to pass to the publisher