GNU Ocrad 0.28 text recognition system

21 Jan 2022 5:33 am GMT+0000

Three years after the previous release, version 0.28 of the Ocrad text recognition system (Optical Character Recognition), developed under the auspices of the GNU project, has been released. Ocrad can be used both as a library for integrating OCR functions into other applications and as a standalone utility that takes an image as input and outputs text in UTF-8 or 8-bit encodings.
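The standalone-utility usage described above could be scripted roughly as follows. This is a minimal sketch, assuming the `ocrad` binary is installed and on `PATH`, and that its `--format` (`utf8` or `byte`) and `--charset` options behave as described in the Ocrad manual; the helper names `build_ocrad_command` and `recognize` are hypothetical and not part of Ocrad.

```python
import shutil
import subprocess
from typing import List, Optional


def build_ocrad_command(image_path: str, utf8: bool = True,
                        charset: Optional[str] = None) -> List[str]:
    """Build the argument list for the ocrad command-line utility.

    Hypothetical helper: only assembles arguments, does not run anything.
    """
    cmd = ["ocrad"]
    # --format selects the output encoding: "utf8" or "byte" (8-bit).
    cmd.append("--format=utf8" if utf8 else "--format=byte")
    if charset is not None:
        # Assumed values: "ascii", "iso-8859-9", "iso-8859-15".
        cmd.append(f"--charset={charset}")
    cmd.append(image_path)
    return cmd


def recognize(image_path: str, utf8: bool = True,
              charset: Optional[str] = None) -> Optional[str]:
    """Run ocrad on an image and return the recognized text.

    Returns None if the ocrad binary cannot be found on PATH.
    """
    if shutil.which("ocrad") is None:
        return None
    result = subprocess.run(
        build_ocrad_command(image_path, utf8=utf8, charset=charset),
        capture_output=True,
        check=True,
    )
    # Assumption: 8-bit output is decoded here as ISO-8859-15.
    encoding = "utf-8" if utf8 else "iso8859_15"
    return result.stdout.decode(encoding, errors="replace")
```

Calling `recognize("page.png")` would then return the recognized text as a string, or `None` when Ocrad is not installed. Keeping the command construction separate from execution makes the argument list easy to inspect without running the tool.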

Ocrad performs optical recognition using a feature extraction method. It includes a page layout analyzer that can correctly separate columns and blocks of text in printed documents. Recognition is supported only for characters from the ASCII, ISO-8859-9, and ISO-8859-15 encodings (there is no Cyrillic support).

The new release includes a large number of small fixes and improvements. The most significant change is support for the PNG image format, implemented using the libpng library. This makes the program considerably easier to use, since previously input images could only be supplied in the PNM formats.
