OCR

OCR (Optical Character Recognition) is the technical term for the automatic recognition of printed characters captured by optical rasterization (e.g. with scanners or digital cameras). Put simply, OCR aims to have a computer transcribe printed text.
Lawrence Roberts, who conducted the first experiments on automatic text recognition at MIT in 1960, is said to be the father of OCR. The first practical applications of OCR, as hardware solutions, appeared in 1965. Back then, recognition was limited to specially designed fonts such as OCR-A and OCR-B. In 1976, Ray Kurzweil developed the first omnifont (i.e. font-independent) OCR system. With increasing computer performance, software-based OCR solutions have steadily gained importance since the mid-1980s.
OCR can be divided into the processing steps of scanning, layout analysis, segmentation, character recognition and dictionary lookup, although the boundaries between these steps are increasingly blurred in modern systems. Typical applications of OCR are document recognition, archiving systems and forms processing.
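
The processing steps can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the 3x5 glyph templates, the function names (render, smudge, segment, recognize, ocr) and the two-word dictionary are hypothetical. Scanning is simulated by drawing text into a bitmap, layout analysis is skipped by assuming a single line of text, segmentation splits at blank columns, recognition is plain template matching, and the dictionary lookup uses Python's difflib. Real OCR systems use far more robust methods at every step.

```python
from difflib import get_close_matches

# Hypothetical 3x5 glyph templates ('#' = ink, '.' = background).
GLYPHS = {
    "c": ["###", "#..", "#..", "#..", "###"],
    "a": [".#.", "#.#", "###", "#.#", "#.#"],
    "t": ["###", ".#.", ".#.", ".#.", ".#."],
    "o": ["###", "#.#", "#.#", "#.#", "###"],
}

def render(text):
    """Stand-in for scanning: draw text as a bitmap (list of row strings)."""
    return [".".join(GLYPHS[ch][r] for ch in text) for r in range(5)]

def smudge(bitmap, pixels):
    """Simulate scanner noise by adding ink at the given (row, column) pixels."""
    rows = [list(row) for row in bitmap]
    for r, c in pixels:
        rows[r][c] = "#"
    return ["".join(row) for row in rows]

def segment(bitmap):
    """Segmentation: cut the line into glyph bitmaps at blank columns."""
    width = len(bitmap[0])
    blank = [all(row[x] == "." for row in bitmap) for x in range(width)]
    segments, start = [], None
    for x in range(width + 1):
        is_ink = x < width and not blank[x]
        if is_ink and start is None:
            start = x                      # a glyph begins here
        elif not is_ink and start is not None:
            segments.append([row[start:x] for row in bitmap])
            start = None                   # the glyph ended before column x
    return segments

def recognize(glyph):
    """Character recognition: choose the template with the fewest differing pixels."""
    def distance(template):
        return sum(a != b
                   for t_row, g_row in zip(template, glyph)
                   for a, b in zip(t_row, g_row))
    return min(GLYPHS, key=lambda ch: distance(GLYPHS[ch]))

def ocr(bitmap, dictionary):
    """Run segmentation, recognition and dictionary lookup on a one-line bitmap."""
    raw = "".join(recognize(g) for g in segment(bitmap))
    matches = get_close_matches(raw, dictionary, n=1, cutoff=0.6)
    return raw, (matches[0] if matches else raw)

# Two extra ink pixels close the opening of the "c" in "cat".
noisy = smudge(render("cat"), [(1, 2), (2, 2)])
raw, corrected = ocr(noisy, ["cat", "taco"])
print(raw, "->", corrected)
```

In this example the smudged "c" is closer to the "o" template, so the raw reading is "oat", and the dictionary lookup then restores "cat". This is the kind of error the final step is meant to catch.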