Custom plugin for datacraft that generates values from regular expressions. It uses the rstr package. The name xeger is "regex" spelled backwards, and the plugin was inspired by the original Java package Xeger.
You can use xeger as a type in your datacraft data specs. For example:
{
  "ssn": {
    "type": "xeger",
    "data": "\\d{3}-\\d{2}-\\d{4}"
  }
}
$ datacraft -s xeger.json -i 3 --format json-pretty -x -l error
[
    {
        "ssn": "322-81-1469"
    },
    {
        "ssn": "697-21-8178"
    },
    {
        "ssn": "340-78-5377"
    }
]
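One way to sanity-check generated values like these is to match each one against the intended pattern. The sketch below uses only Python's standard re module and the three sample values shown above. The names SSN_PATTERN and matches_pattern are illustrative and are not part of datacraft or datacraft_xeger.

```python
import re

# The SSN pattern as a Python raw string (the JSON spec doubles each backslash):
# three digits, two digits, four digits, separated by hyphens.
SSN_PATTERN = r"\d{3}-\d{2}-\d{4}"


def matches_pattern(value: str, pattern: str = SSN_PATTERN) -> bool:
    """Return True if the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


# Sample values taken from the output above.
samples = ["322-81-1469", "697-21-8178", "340-78-5377"]
results = [matches_pattern(s) for s in samples]
print(results)
```

re.fullmatch is used rather than re.match so that a value with trailing characters is rejected.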
You can also use the datacraft_xeger module to create custom datacraft value suppliers from regex patterns. The example below registers custom types for the phone number patterns of several countries.
import datacraft
import datacraft_xeger.suppliers as xeger

phone_patterns = {
    # type_name: pattern
    'uk-phone': r'\+44 \d{4} \d{6}',
    'aus-phone': r'\+61 4\d{2} \d{3} \d{3}',
    'nz-phone': r'\+64 \d{2} \d{4} \d{4}',
    # ...
}


@datacraft.registry.types('uk-phone')
def _custom_regex_uk_phone(spec, loader):
    return xeger.xeger_supplier(phone_patterns['uk-phone'])


@datacraft.registry.types('aus-phone')
def _custom_regex_aus_phone(spec, loader):
    return xeger.xeger_supplier(phone_patterns['aus-phone'])


@datacraft.registry.types('nz-phone')
def _custom_regex_nz_phone(spec, loader):
    return xeger.xeger_supplier(phone_patterns['nz-phone'])
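When the pattern table grows, writing one decorated function per type becomes repetitive. A possible alternative, sketched below, registers every entry in a loop. The helper register_pattern_types is hypothetical and is not part of datacraft or datacraft_xeger. It takes the registration decorator and the supplier factory as arguments, so the block itself runs without either package installed. It assumes the registration decorator can be applied by calling it directly, as any Python decorator can.

```python
from typing import Callable, Dict


def register_pattern_types(
    patterns: Dict[str, str],
    register: Callable[[str], Callable],
    make_supplier: Callable[[str], object],
) -> None:
    """Register one type handler per (type_name, pattern) entry.

    Hypothetical helper: `register` is expected to behave like
    datacraft.registry.types and `make_supplier` like xeger.xeger_supplier.
    """

    def make_handler(pattern: str) -> Callable:
        # A factory function gives each handler its own pattern. Closing over
        # the loop variable directly would leave every handler with the last one.
        def handler(spec, loader):
            return make_supplier(pattern)

        return handler

    for type_name, pattern in patterns.items():
        # Equivalent to decorating the handler with @register(type_name).
        register(type_name)(make_handler(pattern))
```

With the imports from the example above, the call would be register_pattern_types(phone_patterns, datacraft.registry.types, xeger.xeger_supplier).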
Once registered, these types can be used in a data spec like any built-in type. For example:
{
  "name": ["ann", "bob", "carl"],
  "age": { "type": "rand_int_range", "data": [25, 75] },
  "phone": {
    "type": "weighted_ref",
    "data": {
      "UK": 0.5, "AUS": 0.3, "NZ": 0.2
    }
  },
  "refs": {
    "UK": { "type": "uk-phone" },
    "AUS": { "type": "aus-phone" },
    "NZ": { "type": "nz-phone" }
  }
}
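The weighted_ref type selects one of the named refs for each record according to the given weights. The sketch below imitates that selection with Python's standard random module. It only illustrates weighted choice and is not datacraft's implementation; pick_ref is a hypothetical name.

```python
import random
from collections import Counter

# Weights copied from the "phone" field of the spec above.
weights = {"UK": 0.5, "AUS": 0.3, "NZ": 0.2}


def pick_ref(rng: random.Random, weights: dict = weights) -> str:
    """Pick one ref name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(pick_ref(rng) for _ in range(10_000))
print(counts)
```

Over many records, roughly half of the phone values should come from the UK ref, about 30% from AUS, and about 20% from NZ.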
Run datacraft against this spec, loading the custom code with the -c flag:
$ datacraft -s custom.json -c custom.py -i 3 --format json-pretty -x -l warn
[
    {
        "name": "ann",
        "age": 67,
        "phone": "+64 07 2500 7403"
    },
    {
        "name": "bob",
        "age": 49,
        "phone": "+61 435 126 947"
    },
    {
        "name": "carl",
        "age": 61,
        "phone": "+44 7693 148185"
    }
]