finos/greenkey-asrtoolkit

clean up should gracefully handle phone numbers

Closed · 1 comment

Is your feature request related to a problem? Please describe.
Phone numbers in transcripts are mapped to a long series of numerals.

Describe the solution you'd like
They should be mapped to spelled-out single digits.

Examples:

1-317-222-222 should map to the series of digit words 'one', 'three', 'one', 'seven', etc.

RE_PHONE = r'(\(?\d{3}\)?[\.\s|\-]?\d{3}[\.\s|\-]?\d{4})'
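A minimal sketch of how that pattern could drive the requested behavior, assuming plain Python and the standard `re` module. The names `DIGIT_WORDS` and `spell_out_phone_numbers` are hypothetical and are not part of asrtoolkit; the pattern is copied as written above.

```python
import re

# Pattern copied verbatim from the issue text.
RE_PHONE = re.compile(r'(\(?\d{3}\)?[\.\s|\-]?\d{3}[\.\s|\-]?\d{4})')

# Hypothetical lookup table: one spelled-out word per ASCII digit.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def spell_out_phone_numbers(text: str) -> str:
    """Replace each phone-number match with its digits spelled out one by one."""

    def _spell(match: re.Match) -> str:
        # Keep only the digits; parentheses and separators are dropped.
        return " ".join(DIGIT_WORDS[ch] for ch in match.group(1) if ch in DIGIT_WORDS)

    return RE_PHONE.sub(_spell, text)


print(spell_out_phone_numbers("call (317) 555-0123 today"))
# call three one seven five five five zero one two three today
```

Two things to keep in mind with this sketch: the pattern requires a four-digit final group and does not cover a leading country code such as `1-`, and the `|` inside each character class is matched as a literal pipe character rather than acting as alternation.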