Text capture is the automatic recognition of printed characters from an optically rasterized image, produced for example by a scanner or a digital camera. Put simply, text capture aims to have printed text transcribed by a computer.
Lawrence Roberts is said to be the father of text capture; he conducted the first experiments on automatic text recognition at MIT in 1960. The first practical applications of text capture appeared in 1965 as hardware solutions. At that time, recognition was limited to specially designed fonts such as OCR-A and OCR-B. In 1976 Ray Kurzweil developed the first omnifont (that is, font-independent) text capture system. As computer performance has increased, software-based text capture solutions have become steadily more important since the mid-1980s.
Text capture can be divided into the processing steps of scanning, layout analysis, segmentation, character recognition, and dictionary lookup, although the boundaries between these steps are increasingly blurred in modern systems. Typical applications of text capture are document recognition, archiving systems, and forms processing (see FormPro).
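To make the last processing step more concrete, the following is a minimal sketch of a dictionary lookup that corrects misrecognized words after character recognition. It is an illustration under stated assumptions, not the method of any particular text capture system: the word list `LEXICON`, the function names `correct_word` and `correct_line`, and the similarity cutoff of 0.8 are all hypothetical, and the matching relies on Python's standard `difflib` module rather than on a dedicated OCR lexicon.

```python
import difflib

# Hypothetical word list for illustration; a real system would load a full lexicon.
LEXICON = ["character", "recognition", "printed", "text", "scanner", "document"]


def correct_word(word, lexicon=LEXICON, cutoff=0.8):
    """Return the closest lexicon entry, or the word unchanged if none is similar enough."""
    lowered = word.lower()
    if lowered in lexicon:
        return word
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0)
    # and keeps only those at or above the cutoff.
    matches = difflib.get_close_matches(lowered, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_line(line, lexicon=LEXICON):
    """Apply the dictionary lookup to every whitespace-separated word of a line."""
    return " ".join(correct_word(word, lexicon) for word in line.split())


if __name__ == "__main__":
    # Typical recognition confusions: the letter "i" read as "l", "o" read as "0".
    print(correct_line("prlnted character recogniti0n"))
```

Words with no sufficiently similar lexicon entry are passed through unchanged, so names and unknown terms are not forced onto a wrong dictionary word. The cutoff is a trade-off: a lower value corrects more errors but also raises the risk of false corrections.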