Google's Tesseract seems to be just about the best OCR out there. It doesn't seem to play well with others yet (it's written on the assumption that it's a standalone utility, not a library) but given that it's Google, it'll probably get a lot better fast.
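Since it's a standalone utility, the obvious way to use it from a script is to shell out to it. Here's a rough sketch of what that might look like in Python, assuming the `tesseract` binary is on the PATH and takes the usual `tesseract <image> <output base> -l <lang>` arguments, writing its text to `<output base>.txt`. The function names are mine, purely for illustration.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for a tesseract command-line call.

    Hypothetical helper: assumes the CLI form
    `tesseract <image> <output base> -l <lang>`.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base, lang="eng"):
    """Run tesseract as a subprocess and return the recognized text.

    Assumes tesseract writes its result to `<output_base>.txt`.
    Raises CalledProcessError if tesseract exits with an error.
    """
    cmd = build_tesseract_command(image_path, output_base, lang)
    subprocess.run(cmd, check=True)
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

Something like `ocr_image("scan.tif", "scan_out")` would then hand back the text as a string. Going through a temp file like this is clunky, which is pretty much the "doesn't play well with others" problem in a nutshell.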
I should probably investigate. OCR is an important component of a lot of translation jobs, and all existing OCR sucks. Sigh. That's only partly hyperbole.