Together we win! 🇮🇱
I volunteer on a project for which I developed a generative transliteration pipeline, optimized for Arabic/Persian names, as explained below 👇.
Contact me if you need help (details below). If you're an 🇮🇱 defense organization or a volunteer project, I can share the pipeline as a donation. If your data is sensitive, I can instead donate my static map of ~20k transliterated unique name parts, which can be used to transliterate millions of Arabic full names.
Transliteration: converting text from one language to another to sound the same, e.g. שלום ➡️ Shalom.
Translation: converting text from one language to another to mean the same, e.g. שלום ➡️ Hello.
Arabic full names are chains of names.
Therefore, I developed a pipeline that is optimized for Arabic/Persian names, as demonstrated in `notebooks/transliteration.ipynb`:
- Splitting unique full names into unique name parts.
- Cleaning: dropping name parts that don't contain any Arabic/Persian letter, cleaning surrounding characters that are not Arabic/Persian.
- Transliterating name parts, building a static map.
- Transliterating all full names using the map and my `transliterate` function.
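The first three steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the letter range, the helper names (`clean_part`, `unique_name_parts`, `build_static_map`) and the `transliterate_part` callback (standing in for whatever generative model does the per-part transliteration) are all assumptions.

```python
import re

# Assumption: Arabic letters U+0621-U+064A plus the extra Persian letters
# (peh, tcheh, jeh, keheh, gaf, farsi yeh). The real pipeline may use a wider set.
LETTERS = "\u0621-\u064A\u067E\u0686\u0698\u06A9\u06AF\u06CC"
HAS_LETTER = re.compile(f"[{LETTERS}]")
EDGE_JUNK = re.compile(f"^[^{LETTERS}]+|[^{LETTERS}]+$")


def clean_part(part):
    """Strip surrounding non-Arabic/Persian characters.

    Returns None when the part contains no Arabic/Persian letter at all.
    """
    if not HAS_LETTER.search(part):
        return None
    return EDGE_JUNK.sub("", part)


def unique_name_parts(full_names):
    """Split unique full names on whitespace and keep the cleaned, unique parts."""
    parts = set()
    for name in set(full_names):
        for raw in name.split():
            cleaned = clean_part(raw)
            if cleaned:
                parts.add(cleaned)
    return parts


def build_static_map(parts, transliterate_part):
    """Transliterate each unique part once; the result is a reusable static map."""
    return {part: transliterate_part(part) for part in sorted(parts)}
```

Because each unique part is transliterated only once, the expensive generative step scales with the number of distinct name parts rather than with the number of full names.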
Example:
- Input: 'בדיקה: محمد (النبطية-بئر) جواد رضوان نصار'
- Output: 'בדיקה: Mohammed (Alnabatieh-Bir) Jawad Ridwan Nassar'
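A rough sketch of how a static map could be applied to reproduce the example above. It is not the author's `transliterate` function: `apply_static_map` is a hypothetical helper, the map is hand-filled with the values from the example, and treating every maximal run of Arabic/Persian letters as one name part is an assumption.

```python
import re

# Hypothetical static map, hand-filled with the values from the example above.
STATIC_MAP = {
    "محمد": "Mohammed",
    "النبطية": "Alnabatieh",
    "بئر": "Bir",
    "جواد": "Jawad",
    "رضوان": "Ridwan",
    "نصار": "Nassar",
}

# Assumption: a name part is a maximal run of Arabic/Persian letters.
ARABIC_RUN = re.compile("[\u0621-\u064A\u067E\u0686\u0698\u06A9\u06AF\u06CC]+")


def apply_static_map(text, static_map):
    """Replace each Arabic/Persian run found in the map; leave everything else as-is."""
    return ARABIC_RUN.sub(
        lambda match: static_map.get(match.group(0), match.group(0)),
        text,
    )


print(apply_static_map("בדיקה: محمد (النبطية-بئر) جواد رضوان نصار", STATIC_MAP))
```

Non-Arabic text (here the Hebrew prefix, the parentheses and the hyphen) passes through unchanged, and a part missing from the map is left in its original script rather than guessed.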
I'm an AI Expert & Hands-on Consultant in Vision, NLP, LLMs, OpenAI (GPT with vision, speech, TTS), ML, DL, and Data Science. You can find or contact me on LinkedIn: https://www.linkedin.com/in/oz-livneh/.