An ETL tool written in Python. It provides a small DSL that is translated into a Python script to process input data.
/(pattern)/ {
// Python code to run when the pattern matches
// Use $1, $2, ..., $N to refer to the matching groups
}
or
<csv> {
// Python code to run on every CSV row
// Use $0, $1, ..., $N to refer to a CSV column by its zero-based index
}
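To make the translation step concrete, here is a minimal sketch of how a pattern block could be turned into plain Python. It is not the tool's actual implementation: the helper names `translate_body` and `run_pattern_block`, the variable `m`, and the use of `re.finditer` are all assumptions made for illustration. The naive substitution would also rewrite a `$N` that appears inside a string literal.

```python
import re

def translate_body(body: str) -> str:
    """Rewrite $N placeholders into m.group(N) calls (hypothetical helper)."""
    return re.sub(r"\$(\d+)", r"m.group(\1)", body)

def run_pattern_block(pattern: str, body: str, text: str, env: dict) -> None:
    """Run the translated body once per match of pattern in text (assumed semantics)."""
    code = compile(translate_body(body), "<etl>", "exec")
    for m in re.finditer(pattern, text):
        env["m"] = m  # expose the current match to the block body
        exec(code, env)

env = {"result": []}
run_pattern_block(r"(\d+)", 'result.append({"number": int($1)})', "123-foo 456", env)
print(env["result"])  # [{'number': 123}, {'number': 456}]
```

Sharing one `env` dictionary across blocks is what would let top-level statements such as `result = []` and `print(result)` see the values the blocks collect.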
result = []
/(?m:^)(\d{3})-([a-z]+)/ {
result.append({"number": int($1), "description": 'good' if $2 == 'foo' else 'bad'})
}
/(\d+)/ {
result.append({"number": int($1)})
}
print(result)
>python python-etl.py test.etl
123
^Z
[{'number': 123}]
>python python-etl.py test.etl
123-foo
^Z
[{'number': 123, 'description': 'good'}, {'number': 123}]
>type test\payload | python python-etl.py test.etl
[{'number': 123, 'description': 'good'}, {'number': 1337}, {'number': 123}, {'number': 456}, {'number': 777}]
<csv> {
print($1)
}
>python python-etl.py csv.etl
id,name
1,felipe
^Z
name
felipe
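
The CSV run above can be approximated with the standard `csv` module. This is only a sketch under assumptions: `run_csv_block` and the variable `row` are hypothetical names, and treating `$N` as `row[N]` (zero-based, header row included) is inferred from the output shown, not taken from the tool's source.

```python
import csv
import io
import re

def run_csv_block(body: str, text: str, env: dict) -> None:
    """Run body once per CSV row, with $N rewritten to row[N] (hypothetical helper)."""
    code = compile(re.sub(r"\$(\d+)", r"row[\1]", body), "<etl>", "exec")
    for row in csv.reader(io.StringIO(text)):
        env["row"] = row  # the header line is handled like any other row
        exec(code, env)

out = []
run_csv_block("out.append($1)", "id,name\n1,felipe\n", {"out": out})
print(out)  # ['name', 'felipe']
```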