Automatic image processing of hand-written Morse code.
Hand-written Morse code is not trivial to decode automatically: the dots and dashes are not perfectly aligned, and they vary in size.
This program converts an input image of hand-written Morse code to ASCII.
Full disclaimer: the input photo had some speckles of dust that I removed manually, so the image was slightly edited to make things work nicely.
The program runs several phases on the original image:
- Adjust contrast: All pixels are set to full white or full black, based on a given threshold.
- Detect lines: Purely white pixel rows are flagged, and series of consecutive white rows are used as separators between lines of actual content.
- Pixel drop: For every line of actual content, the line is reduced to a 1D array. If any pixel in the given line's column is black, the array position is also black.
- Statistical analysis: All lines are analyzed and two distribution diagrams are extracted: one for the sign lengths (anything that is black) and one for the break lengths. Each outcome is an overlap of two normal distributions; the saddle point between them is used to distinguish between `_` and `.`, and likewise between character and word breaks.
- All lines are parsed once more. This time the program applies the thresholds indicated by the previous distributions to distinguish between dots, dashes, sign separators and word separators:
.... . .-.. --- .-- --- .-. .-.. -..
.... --- .-- .- .-. .
-.-- --- ..-
-.. --- .. -. --.
- Finally, you can use an online decoder to get back to Latin letters:
HELO WORLD HOW ARE YOU DOING
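
The last step does not strictly need an online tool. As a minimal sketch (not part of the original program), a lookup table is enough to turn the dot/dash output back into letters. The function name `decode_morse` and the use of `/` as a word separator are assumptions made for this example; the program's actual output format for word breaks may differ.

```python
# Illustrative offline decoder; names and the "/" word separator are assumptions.
MORSE_TO_LATIN = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}


def decode_morse(line: str, word_sep: str = "/") -> str:
    """Decode one line of Morse code; unknown signs become '?'."""
    words = []
    for word in line.split(word_sep):
        # Signs within a word are separated by whitespace.
        letters = [MORSE_TO_LATIN.get(code, "?") for code in word.split()]
        if letters:
            words.append("".join(letters))
    return " ".join(words)


print(decode_morse(".... . .-.. --- / .-- --- .-. .-.. -.."))
```

Only the letters A to Z are covered here; digits and punctuation would need additional table entries.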
Looks like I'm not able to correctly spell "Hello" in morse code :)
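
The image-processing phases described above (contrast adjustment, line detection, pixel drop) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the program's actual code: the image is assumed to be already loaded as a list of rows of grayscale values (0 = black, 255 = white), the threshold of 128 is an arbitrary default, and all function names are hypothetical.

```python
from itertools import groupby


def binarize(gray, threshold=128):
    """Adjust contrast: every pixel becomes True (black) or False (white)."""
    return [[px < threshold for px in row] for row in gray]


def content_bands(binary):
    """Detect lines: return (start, end) row ranges that contain any black pixel.

    Purely white rows act as separators between bands.
    """
    bands, start = [], None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            bands.append((start, y))
            start = None
    if start is not None:
        bands.append((start, len(binary)))
    return bands


def pixel_drop(binary, band):
    """Pixel drop: reduce a band to a 1D array; a column is black if any pixel in it is."""
    start, end = band
    return [any(column) for column in zip(*binary[start:end])]


def run_lengths(profile):
    """Return (is_black, length) runs, without the white margins at either end."""
    runs = [(is_black, len(list(group))) for is_black, group in groupby(profile)]
    if runs and not runs[0][0]:
        runs = runs[1:]
    if runs and not runs[-1][0]:
        runs = runs[:-1]
    return runs


# Tiny synthetic "image": one text line holding a short and a long sign.
gray = [
    [255] * 8,
    [255, 0, 255, 0, 0, 0, 255, 255],
    [255, 0, 255, 255, 0, 0, 255, 255],
    [255] * 8,
]
binary = binarize(gray)
for band in content_bands(binary):
    print(run_lengths(pixel_drop(binary, band)))
```

The black run lengths feed the sign-length distribution and the white run lengths feed the break-length distribution used in the statistical analysis phase.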
This is a proof of concept, not a reliable, production-ready implementation.
- It does not always get every character right, but for my own purposes the outcome is good enough to decipher a longer hand-written message.
- The code relies on hard-coded magic values:
- Threshold for contrast boost / distinguish black and white pixels.
- Threshold for `.` / `_` distinction (currently manually extracted from visual distributions)
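
One way the second magic value could be derived automatically, instead of being read off a plot, is sketched below. Note that this swaps in a simpler technique than locating the saddle point between two fitted normal distributions: it takes the midpoint of the widest gap between the observed sign lengths. The function names and the sample lengths are made up for illustration, and the approach assumes the dot and dash lengths are clearly separated.

```python
def split_threshold(lengths):
    """Return the midpoint of the widest gap between consecutive distinct lengths."""
    values = sorted(set(lengths))
    if len(values) < 2:
        raise ValueError("need at least two distinct lengths")
    low, high = max(zip(values, values[1:]), key=lambda pair: pair[1] - pair[0])
    return (low + high) / 2


def classify_signs(sign_lengths):
    """Label every black run as a dot or a dash, using the derived threshold."""
    threshold = split_threshold(sign_lengths)
    return "".join("." if n <= threshold else "-" for n in sign_lengths)


# Hypothetical sign lengths in pixels: short runs are dots, long runs are dashes.
print(split_threshold([4, 5, 6, 5, 14, 16, 15, 5]))
print(classify_signs([5, 4, 15, 6, 16]))
```

With noisy input, where the two distributions overlap heavily, a widest-gap split can pick the wrong gap, so a histogram-based valley search may be the more robust choice.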
- M.Schiedermeier (m5c)