After two years of development, the GNU project has announced the release of GNU Ocrad 0.29, an optical character recognition (OCR) system. Ocrad can be used as a library for integrating OCR functions into other applications, or as a standalone utility that takes an image as input and outputs text in UTF-8 or 8-bit encodings. [1] [2]
For recognition, Ocrad uses a feature extraction method. It includes a page layout analyzer that separates columns and blocks of text in printed documents. Ocrad recognizes only characters from the ASCII, ISO-8859-9, and ISO-8859-15 character sets; Cyrillic is not supported. [3]
In the new version:
- Improved recognition of the letter “l” when its right part is slanted.
- When the option ‘-o’ (‘--output’) is used, missing intermediate directories in the given file path are now created.
- The MAKEINFO variable was added to the configure script and Makefile.in.
- Diagnostic messages related to file operations now use the format ‘PROGRAM: FILE: MESSAGE’.
- Diagnostics about invalid arguments to command-line options now show both the argument and the name of the option.
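Two of the changes above, creating missing directories for the output path and the ‘PROGRAM: FILE: MESSAGE’ diagnostic format, can be illustrated with a minimal Python sketch. This is not Ocrad's code: the function names `open_output`, `format_diagnostic`, and `demo`, the program name `ocr-sketch`, and the sample path are all hypothetical and only mimic the behavior described in the release notes.

```python
import os
import sys
import tempfile


def open_output(path):
    """Open `path` for writing, creating missing parent directories first.

    Hypothetical helper that mimics the described '-o' behavior.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this a no-op when the directories already exist.
        os.makedirs(parent, exist_ok=True)
    return open(path, "w", encoding="utf-8")


def format_diagnostic(program, filename, message):
    """Build a file diagnostic in the 'PROGRAM: FILE: MESSAGE' form."""
    return f"{program}: {filename}: {message}"


def demo():
    """Write to a path whose intermediate directories do not exist yet."""
    with tempfile.TemporaryDirectory() as root:
        out_path = os.path.join(root, "scans", "batch1", "page1.txt")
        try:
            with open_output(out_path) as out:
                out.write("recognized text\n")
        except OSError as err:
            # Report the failure in the uniform diagnostic format.
            print(format_diagnostic("ocr-sketch", out_path, err.strerror),
                  file=sys.stderr)
            return False
        return os.path.isfile(out_path)
```

Calling `demo()` writes a file two directory levels below a fresh temporary directory. A uniform diagnostic format like this one is easy to filter with standard text tools, since the program name and file name always appear in the same positions.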
Sources: [1] [email protected] [2] gnu.org [3] Wikipedia