greyblake/nutype

Ability to derive `fake-rs`'s `Dummy` trait

sidrubs opened this issue · 7 comments

fake-rs is commonly used to generate fake data for testing purposes. Any struct that implements the Dummy trait can easily be generated with fake data. They also have a derive macro to derive the impl of the Dummy trait for a particular struct.

It would be super cool to have fake-rs available under a feature flag to easily impl the Dummy trait for a nutype struct.

If you are open to having this added to nutype I would be happy to create a PR. I would just need to be pointed in the correct direction to get started.

At the moment I am working on adding support for Arbitrary to enable fuzzing and property-based testing.

My first question: could Arbitrary be an alternative for Dummy?

Thanks for offering your help, I appreciate it.
The problem is that implementing support for something like this is harder than it may seem at first glance.

To generate a valid value, one has to respect every possible combination of sanitizers and validators (guards). Essentially, this means a single generic implementation is not possible: every combination of guards needs to be handled separately.

Even then, it cannot be supported fully, because custom sanitizers and validators cannot be handled.
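
To make the difficulty concrete, here is a minimal std-only sketch of what "respecting the guards" means for one combination: an integer with an inclusive lower and upper bound. The `Percent` type, its `new` constructor, and the tiny `Lcg` generator are hypothetical stand-ins written for this illustration; they are not nutype's generated code, nor the API of `arbitrary` or `fake`.

```rust
/// Hypothetical hand-written newtype: an integer validated to lie in 0..=100.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Percent(u8);

impl Percent {
    pub const MIN: u8 = 0;
    pub const MAX: u8 = 100;

    /// The only way to construct a value, so validation always runs.
    pub fn new(raw: u8) -> Result<Self, String> {
        if (Self::MIN..=Self::MAX).contains(&raw) {
            Ok(Percent(raw))
        } else {
            Err(format!("{} is out of range", raw))
        }
    }

    pub fn into_inner(self) -> u8 {
        self.0
    }
}

/// Tiny linear congruential generator, for illustration only.
pub struct Lcg(u64);

impl Lcg {
    pub fn new(seed: u64) -> Self {
        Lcg(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the higher bits, which are better distributed in an LCG.
        self.0 >> 33
    }
}

/// Generator that knows the bounds, so every output passes validation.
pub fn random_percent(rng: &mut Lcg) -> Percent {
    // Number of valid values in the inclusive range.
    let span = (Percent::MAX - Percent::MIN) as u64 + 1;
    let raw = Percent::MIN + (rng.next_u64() % span) as u8;
    Percent::new(raw).expect("raw was drawn from the valid range")
}
```

A type with a different guard set (only a lower bound, say, or a custom predicate) would need a different sampling strategy, which is why one generic implementation does not cover all cases.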

Interesting, Arbitrary could be an alternative for generating random types for unit testing. However, the nice thing about Dummy, in particular, is that you can automatically derive Dummy for a parent struct if all of its fields implement Dummy.

Yeah, I see why you say it would be complicated 🤔 Could one require the user to add a specific fake expression as an argument to the macro to generate something that would pass validation (similar to how fake specifies field specific generation with field attributes)?

How are you planning on doing validation for Arbitrary?

@sidrubs

in particular, is that you can automatically derive Dummy for a parent struct if all of its fields implement Dummy.

Same with Arbitrary.

Could one require the user to add a specific fake expression as an argument to the macro to generate something that would pass validation (similar to how fake specifies field specific generation with field attributes)?

It could, but in terms of effort it's the same as asking users to implement the trait themselves, which they can obviously already do.

How are you planning on doing validation for Arbitrary

The point is to correctly respect all the sanitization and validation rules.
If you're very curious this is the implementation for the integer-based types: https://github.com/greyblake/nutype/blob/master/nutype_macros/src/integer/gen/traits/arbitrary.rs

But that was low-hanging fruit. E.g. strings have many more edge cases.
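
One such string edge case can be sketched with std only: a sanitizer changes the value before the validator sees it, so a generator that only looks at the raw length can produce rejected inputs. The `Name` type below, with its trimming step and its 3..=8 character bound, is a hypothetical hand-written example, not nutype output.

```rust
/// Hypothetical newtype: the input is trimmed (sanitizer), then must contain
/// between 3 and 8 characters (validator).
#[derive(Debug)]
pub struct Name(String);

impl Name {
    pub fn new(raw: &str) -> Result<Self, String> {
        // Sanitization runs first...
        let sanitized = raw.trim().to_string();
        // ...and validation only sees the sanitized value.
        // Characters are counted, not bytes.
        let len = sanitized.chars().count();
        if (3..=8).contains(&len) {
            Ok(Name(sanitized))
        } else {
            Err(format!("length {} is outside 3..=8", len))
        }
    }

    pub fn into_inner(self) -> String {
        self.0
    }
}

/// Returns true if a candidate produced by some generator would be accepted.
pub fn is_valid_candidate(raw: &str) -> bool {
    Name::new(raw).is_ok()
}
```

Here `"  ab  "` has six raw characters yet is rejected, because only two remain after trimming. A string of six accented letters is accepted even though it occupies twelve bytes.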

I looked a bit more into Arbitrary, it is pretty cool!

But that was low-hanging fruit. E.g. strings have many more edge cases.

Oh yeah, strings are going to be interesting.

For my understanding I tried to manually implement the Dummy trait for a simple Email nutype struct. I get the following error.

error[E0423]: expected value, found struct `Email`
  --> src/main.rs:32:22
   |
32 |     let fake_email = Email.fake();
   |                      ^^^^^- help: use the path separator to refer to an item: `::`

For more information about this error, try `rustc --explain E0423`.

I'm assuming that it has something to do with how nutype prevents creating a struct directly (to avoid skipping validation), or is there something I am doing wrong in implementing the Dummy trait? (I know you didn't write fake, but maybe you can point me in the correct direction.)

Below is the code I used:

main.rs

use lazy_static::lazy_static;
use nutype::nutype;
use regex::Regex;

use fake::faker::internet::raw::SafeEmail;
use fake::locales::EN;
use fake::{Dummy, Fake, Rng};

lazy_static! {
    static ref EMAIL_REGEX: Regex = Regex::new(
        r"^([a-z0-9_+]([a-z0-9_+.]*[a-z0-9_+])?)@([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})"
    )
    .unwrap();
}

#[nutype(
    validate(regex = EMAIL_REGEX),
    derive(Debug, Clone),
)]
pub struct Email(String);

/// Implement [`fake`]'s [`Dummy`] trait for the [`Email`] struct
impl Dummy<Email> for Email {
    fn dummy_with_rng<R: Rng + ?Sized>(_: &Email, rng: &mut R) -> Self {
        let raw_email: String = SafeEmail(EN).fake_with_rng(rng);

        Email::new(raw_email).unwrap()
    }
}

fn main() {
    let fake_email = Email.fake();

    println!("{}", fake_email.into_inner())
}

Cargo.toml

[package]
name = "impl-dummy-poc"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
fake = "2.9.1"
lazy_static = "1.4.0"
nutype = { version = "0.4.0", features = ["regex"] }
regex = "1.10.2"

@sidrubs I haven't used the fake crate before (though I know the concept from the Ruby world), but it looks like what you are doing does not match their docs.

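These corrections make it work.

First, import Faker:

use fake::{Dummy, Fake, Rng, Faker};

Second, call fake() on the Faker value and annotate the target type:

    let fake_email: Email = fake::Faker.fake();

Third, implement Dummy<Faker> instead of Dummy<Email>:

impl Dummy<fake::Faker> for Email {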

Full working code:

use lazy_static::lazy_static;
use nutype::nutype;
use regex::Regex;

use fake::faker::internet::raw::SafeEmail;
use fake::locales::EN;
use fake::{Dummy, Fake, Rng, Faker};

lazy_static! {
    static ref EMAIL_REGEX: Regex = Regex::new(
        r"^([a-z0-9_+]([a-z0-9_+.]*[a-z0-9_+])?)@([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})"
    )
    .unwrap();
}

#[nutype(
    validate(regex = EMAIL_REGEX),
    derive(Debug, Clone),
)]
pub struct Email(String);

/// Implement [`fake`]'s [`Dummy`] trait for the [`Email`] struct
impl Dummy<Faker> for Email {
    fn dummy_with_rng<R: Rng + ?Sized>(_: &Faker, rng: &mut R) -> Self {
        // Generate a raw address, then go through the validating constructor.
        let raw_email: String = SafeEmail(EN).fake_with_rng(rng);

        // Panics if the generated address does not match EMAIL_REGEX.
        Email::new(raw_email).unwrap()
    }
}

fn main() {
    let fake_email: Email = fake::Faker.fake();

    println!("{}", fake_email.into_inner())
}
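
The shape of the fix can be sketched without either crate. `.fake()` is a method call on a config value, and the target type is chosen by the type annotation. `Email` names a type rather than a value, which is what E0423 complains about. The traits and the `Faker` unit struct below are simplified stand-ins that mimic the layout of `fake`; they are assumptions made for illustration, not the crate's real definitions (the real `Dummy` method is `dummy_with_rng`, as in the code above).

```rust
/// Stand-in for `fake::Dummy<T>`: `Self` can be generated from a config of type `T`.
pub trait Dummy<T>: Sized {
    fn dummy(config: &T) -> Self;
}

/// Stand-in for `fake::Fake`: available on every config value.
pub trait Fake: Sized {
    fn fake<U: Dummy<Self>>(&self) -> U {
        U::dummy(self)
    }
}

impl<T> Fake for T {}

/// Stand-in for `fake::Faker`: a unit struct, so `Faker` is also a value.
pub struct Faker;

/// Hypothetical validated newtype.
pub struct Email(String);

impl Email {
    pub fn new(raw: &str) -> Result<Self, String> {
        if raw.contains('@') {
            Ok(Email(raw.to_string()))
        } else {
            Err(format!("{} is not an email", raw))
        }
    }

    pub fn into_inner(self) -> String {
        self.0
    }
}

/// The config type is `Faker`, not `Email` itself.
impl Dummy<Faker> for Email {
    fn dummy(_: &Faker) -> Self {
        // A fixed sample keeps the sketch deterministic.
        Email::new("user@example.com").expect("fixed sample is valid")
    }
}

pub fn make_email() -> Email {
    // `Faker` is the value `.fake()` is called on; the annotation
    // selects the `Dummy<Faker> for Email` implementation.
    let email: Email = Faker.fake();
    email
}
```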

That gets me going for now, so I am going to close this issue. But if you ever want help integrating fake into nutype's derive system, let me know.

@sidrubs Glad to help!
In your place I would use macro_rules! if you are going to implement similar scenarios a lot.
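
A minimal sketch of that macro_rules! idea, assuming each newtype exposes a validating `new(String)` constructor. To keep it std-only, `Dummy` and `Faker` below are simplified stand-ins for the `fake` items, and `impl_dummy!`, `Email` and `Username` are hypothetical names. With the real crate the generated method would be `dummy_with_rng`, and the expression would use the `rng` argument.

```rust
/// Stand-in for `fake::Dummy<T>` (the real trait method is `dummy_with_rng`).
pub trait Dummy<T>: Sized {
    fn dummy(config: &T) -> Self;
}

/// Stand-in for `fake::Faker`.
pub struct Faker;

/// Hypothetical helper: implements `Dummy<Faker>` for a newtype from an
/// expression that yields a raw value accepted by `new`.
macro_rules! impl_dummy {
    ($ty:ty, $make_raw:expr) => {
        impl Dummy<Faker> for $ty {
            fn dummy(_: &Faker) -> Self {
                <$ty>::new($make_raw).expect("generator must produce a valid raw value")
            }
        }
    };
}

/// Hypothetical validated newtypes.
pub struct Email(String);
pub struct Username(String);

impl Email {
    pub fn new(raw: String) -> Result<Self, String> {
        if raw.contains('@') { Ok(Email(raw)) } else { Err(raw) }
    }
    pub fn into_inner(self) -> String { self.0 }
}

impl Username {
    pub fn new(raw: String) -> Result<Self, String> {
        if raw.is_empty() { Err(raw) } else { Ok(Username(raw)) }
    }
    pub fn into_inner(self) -> String { self.0 }
}

// One line per type instead of a hand-written impl block each time.
impl_dummy!(Email, "user@example.com".to_string());
impl_dummy!(Username, "alice".to_string());
```

The macro only removes boilerplate. Each invocation still has to supply an expression whose output passes that type's validation.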