Data Crawlers for Simple Optical Character Recognition
Submitted to CEC 2000

Abstract

Many genetic programming systems have been designed to exploit state information in an indirect fashion. In this article we apply a genetic programming technique that directly incorporates state information to a collection of related optical character recognition tasks. Our recognizers are coded as GP-Automata: finite state machines modified by associating a function, stored as a parse tree, with each state. These functions are called deciders, and they extract information from a high-bandwidth input to drive finite state transitions. The GP-Automata make iterated decisions, requesting additional data in an adaptive fashion. This iterated data processing is a form of "crawling through the data", and so we term the software objects data crawlers. These objects can be thought of as expert systems, produced automatically from data by digital evolution. The states form rules, with the deciders supplying the "if" part of these rules. We evolve perfect recognizers for three variations of a character set derived from the set of 4-ominoes.
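To make the idea of a state machine driven by per-state deciders concrete, here is a minimal hand-written sketch in Python. It is not the representation used in the paper, and nothing in it is evolved. The names `Transition`, `State`, `pixel`, and `crawl`, the boolean deciders, the cursor moves, and the three-piece test set are all assumptions chosen for illustration. In the paper the deciders are parse trees; here they are plain Python functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]
# Hypothetical decider signature: inspects the image near a cursor, returns a bool.
Decider = Callable[[Grid, int, int], bool]


@dataclass
class Transition:
    next_state: int = 0
    move: Tuple[int, int] = (0, 0)   # (row, column) step applied to the cursor
    verdict: Optional[str] = None    # if set, the crawler halts with this label


@dataclass
class State:
    decider: Decider        # the "if" part of the rule
    if_true: Transition     # taken when the decider returns True
    if_false: Transition    # taken when the decider returns False


def pixel(grid: Grid, r: int, c: int) -> int:
    """Read one pixel, treating everything outside the grid as empty."""
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        return grid[r][c]
    return 0


def crawl(states: List[State], grid: Grid,
          start: Tuple[int, int] = (0, 0), max_steps: int = 20) -> Optional[str]:
    """Run the automaton from state 0 until it issues a verdict or times out."""
    r, c = start
    s = 0
    for _ in range(max_steps):
        state = states[s]
        t = state.if_true if state.decider(grid, r, c) else state.if_false
        if t.verdict is not None:
            return t.verdict
        r, c = r + t.move[0], c + t.move[1]
        s = t.next_state
    return None  # no decision reached within the step budget


# Hand-written two-state crawler for three pieces anchored at the top-left corner.
STATES = [
    # State 0: is the pixel to the right of the cursor set?
    State(decider=lambda g, r, c: pixel(g, r, c + 1) == 1,
          if_true=Transition(next_state=1, move=(0, 1)),
          if_false=Transition(verdict="I-vertical")),
    # State 1: is the pixel below the cursor set?
    State(decider=lambda g, r, c: pixel(g, r + 1, c) == 1,
          if_true=Transition(verdict="O"),
          if_false=Transition(verdict="I-horizontal")),
]

I_VERT = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
I_HORZ = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
O_PIECE = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

for name, grid in [("I_VERT", I_VERT), ("I_HORZ", I_HORZ), ("O_PIECE", O_PIECE)]:
    print(name, "->", crawl(STATES, grid))
```

Each state reads only the pixels its decider asks for, and the chosen transition determines where the crawler looks next. This is one way to read the "requesting additional data in an adaptive fashion" described above. In an evolutionary setting, the deciders and the transition table would be the objects subject to variation and selection.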