nomemory/mockneat

Feature to be able to generate data in chunks with a seed...

rajib76 opened this issue · 4 comments

This is not an issue but a feature request. I will try to see if I can add it myself, but anyone who wants to pick it up is welcome to. The requirement is as follows.

Let's say that I want to create a credit card file with 10,000 records. I should be able to create the file either in one run or in 5 runs of 2,000 records each. When I compare the single 10,000-record file with the 5 files of 2,000 records each, the content should be the same.
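To make the requirement concrete, here is a minimal sketch in plain Java of one way to get chunk-independent output. It does not use MockNeat; the class `ChunkedGenerator` and its methods are hypothetical names made up for illustration. The assumption is that each record is derived only from a base seed and its record index, so the result does not depend on how the records are split across runs. The 16-digit strings are random digits, not valid card numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of MockNeat: every record depends only on
// (baseSeed, recordIndex), so batching does not change the output.
final class ChunkedGenerator {

    // SplitMix64-style mixing so neighbouring indexes get unrelated seeds.
    static long mix(long baseSeed, long index) {
        long z = baseSeed + index * 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // Builds a fake 16-digit number for one record (no Luhn check).
    static String record(long baseSeed, long index) {
        Random rnd = new Random(mix(baseSeed, index));
        StringBuilder sb = new StringBuilder(16);
        for (int i = 0; i < 16; i++) {
            sb.append(rnd.nextInt(10));
        }
        return sb.toString();
    }

    // Generates the records with indexes in [start, start + count).
    static List<String> chunk(long baseSeed, long start, int count) {
        List<String> out = new ArrayList<>(count);
        for (long i = start; i < start + count; i++) {
            out.add(record(baseSeed, i));
        }
        return out;
    }

    public static void main(String[] args) {
        long seed = 42L;
        // One run of 10,000 records.
        List<String> oneRun = chunk(seed, 0, 10_000);
        // Five runs of 2,000 records each, concatenated.
        List<String> fiveRuns = new ArrayList<>();
        for (int run = 0; run < 5; run++) {
            fiveRuns.addAll(chunk(seed, run * 2_000L, 2_000));
        }
        System.out.println(oneRun.equals(fiveRuns)); // prints true
    }
}
```

Each run only needs the base seed and its starting index, so the five runs could happen on different machines or at different times.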

I am not sure whether this feature is already available in MockNeat.

@rajib76

I don't think this feature is implemented in MockNeat, and given how the library is built, I don't think it will be easy to achieve.

Yes, I agree. I also went through the code, and it may not be possible to implement this. I think the callback creates a new object every time, so it would not be possible to create a seed once and reuse it. I will think of an alternative solution.

Are you suggesting this seed feature so that you can always generate the same data for tests, mock back-ends, etc.?

Yes, I think this will be a required feature in certain scenarios. I know of at least one: testing in different environments. If the data set is huge, I would rather not copy it to every environment. Instead, I would run the data generator in each environment and have it produce the same data.

I went through the code again. I noticed that it uses the thread-local random generator, which creates its own internal seed, so the seed is different on every run. I saw that SecureRandom is available, but even if I set it, the code defaults to the thread-local generator. What I did was comment out all the thread-local usage and use SecureRandom instead, then set the seed. Once I did that, I was able to generate the same data in every run. I know SecureRandom will be slower than the thread-local generator, but I think it would be good to have the flexibility to choose between the two.
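As an illustration of the "choose between the two" idea, here is a minimal sketch in plain Java. It is not MockNeat's API; `RandomSourceDemo`, `Source`, and `create` are hypothetical names. It also swaps in `java.util.Random` for the seeded case rather than `SecureRandom`, on the assumption that repeatability is the goal: per the JDK documentation, `SecureRandom.setSeed` supplements the existing seed, so whether it gives repeatable output depends on the provider and algorithm. `ThreadLocalRandom`, by contrast, rejects an explicit seed outright.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch, not MockNeat's API: the caller picks the random source.
final class RandomSourceDemo {

    enum Source { THREAD_LOCAL, SEEDED }

    // Returns the fast, unseedable generator or a reproducible seeded one.
    // The seed argument is ignored for THREAD_LOCAL.
    static Random create(Source source, long seed) {
        return source == Source.SEEDED
                ? new Random(seed)
                : ThreadLocalRandom.current();
    }

    // Draws n ints in [0, 100) from the given generator.
    static int[] draw(Random rnd, int n) {
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = rnd.nextInt(100);
        }
        return out;
    }

    // Returns true if ThreadLocalRandom refuses an explicit seed.
    static boolean threadLocalRejectsSeed() {
        try {
            ThreadLocalRandom.current().setSeed(42L);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] first = draw(create(Source.SEEDED, 7L), 5);
        int[] second = draw(create(Source.SEEDED, 7L), 5);
        System.out.println(Arrays.equals(first, second)); // prints true
        System.out.println(threadLocalRejectsSeed());     // prints true
    }
}
```

Keeping the thread-local source as the default would preserve the current speed, while the seeded source would only be used when reproducible output is wanted.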

Please let me know what you think about it. I can look at the code and make the necessary changes to make it flexible to choose between the two.