Google engineers working on extracting house numbers from Street View images have recently improved their algorithms to the point where they can beat CAPTCHAs.
House numbers are of course a good test case, as in most places there are no real rules about font, orientation and so on, and the character shapes can be somewhat distorted in the images.
At the same time, engineers at Evernote have blogged about indexing handwritten images - a seemingly dry topic, but of course if you have a bunch of scanned handwritten notes you need some text recognition and extraction to identify meaningful text to index.
Putting the two posts together suggests some intriguing possibilities. There’s some agreement that the best way to recognise handwriting is to decompose it into individual letters and do shape recognition - even if, like me, you have handwriting distorted by years of note scribbling, the letter shapes you use are fairly consistent - and being able to cope with distortion better will make those shapes easier to find. At the same time, Evernote’s approach is interesting as it effectively looks up sequences of letters to guess word boundaries and work out which letters belong together.
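To make the word-boundary idea concrete, here is a minimal sketch of the general technique - dictionary-driven segmentation of a run of recognised letters - not Evernote’s actual implementation; the word list and function names are hypothetical stand-ins:

```python
# Hypothetical dictionary - a real system would use a full word list.
WORDS = {"the", "tax", "roll", "medieval", "notes", "hand", "written", "handwritten"}

def segment(letters, words=WORDS):
    """Split a run of letters into dictionary words via dynamic programming.

    Returns the segmentation with the fewest words, or None if the
    letters cannot be split into known words.
    """
    n = len(letters)
    # best[i] holds the best segmentation of letters[:i] found so far
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            word = letters[j:i]
            if best[j] is not None and word in words:
                candidate = best[j] + [word]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return best[n]

print(segment("medievaltaxroll"))   # → ['medieval', 'tax', 'roll']
print(segment("handwrittennotes"))  # → ['handwritten', 'notes']
```

Preferring the segmentation with the fewest words is one simple heuristic; a production recogniser would also weigh per-letter confidence scores from the shape-recognition stage.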
Putting the two together, we have not only the potential for better handwriting OCR, but also the possibility of extracting text from hard-to-read handwritten documents - and here I’m thinking of documents such as medieval tax rolls, which have not usually been digitised or transcribed.
Given that both Evernote and Google contribute to a range of open source projects, there’s more than a chance that we’ll see these technologies becoming available reasonably soon …
Written with StackEdit.