Recognition-based segmentation of on-line hand-printed words.

M. Schenkel, H. Weissman, I. Guyon, C. Nohl, D. Henderson, B. Boser, and L. Jackel.
In S. Hanson et al., editors, Advances in Neural Information Processing Systems 5 (NIPS 92), pages 723--730, San Mateo, CA, Morgan Kaufmann.
1993



This paper reports on the performance of two methods for recognition-based segmentation of strings of on-line hand-printed capital Latin characters. Each input string consists of a time-ordered sequence of X-Y coordinates, punctuated by pen-lifts. The methods were designed to work in "run-on mode", where there is no constraint on the spacing between characters. While both methods use a neural network recognition engine and a graph-algorithmic post-processor, their approaches to segmentation are quite different. The first method, which we call INSEG (for input segmentation), uses a combination of heuristics to identify particular pen-lifts as tentative segmentation points. The second method, which we call OUTSEG (for output segmentation), relies on the empirically trained recognition engine both for recognizing characters and for identifying relevant segmentation points. Our best results are obtained with the INSEG method: 11% error on hand-printed words from an 80,000-word dictionary.
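
The INSEG idea of treating pen-lifts as tentative segmentation points and letting a graph search pick among them can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the heuristics, the network, or the graph algorithm. The sketch assumes that every pen-lift is a candidate cut, that a character spans at most max_strokes consecutive strokes, and that a recognizer returns a (character, log-probability) pair for a group of strokes. The names best_segmentation, toy_recognize, and max_strokes are hypothetical, and toy_recognize is a stand-in for a trained neural network.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Stroke = List[Point]  # pen-down trajectory between two pen-lifts
Recognizer = Callable[[Sequence[Stroke]], Tuple[str, float]]


def best_segmentation(
    strokes: Sequence[Stroke],
    recognize: Recognizer,
    max_strokes: int = 3,  # assumed upper bound on strokes per character
) -> Tuple[str, float]:
    """Pick the highest-scoring path through the graph of tentative cuts.

    Nodes are stroke boundaries (pen-lifts); an edge from `start` to `end`
    is the hypothesis that strokes[start:end] form one character.
    """
    n = len(strokes)
    best = [-math.inf] * (n + 1)  # best log-score of a path reaching each node
    back: List[Tuple[int, str]] = [(-1, "")] * (n + 1)
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_strokes), end):
            if best[start] == -math.inf:
                continue
            char, logp = recognize(strokes[start:end])
            candidate = best[start] + logp
            if candidate > best[end]:
                best[end] = candidate
                back[end] = (start, char)

    # Walk the back-pointers from the last boundary to recover the string.
    chars: List[str] = []
    pos = n
    while pos > 0:
        start, char = back[pos]
        chars.append(char)
        pos = start
    return "".join(reversed(chars)), best[n]


def toy_recognize(group: Sequence[Stroke]) -> Tuple[str, float]:
    """Hypothetical recognizer: scores a group by its stroke count only."""
    table = {1: ("I", 0.5), 2: ("T", 0.9), 3: ("H", 0.6)}
    char, prob = table[len(group)]
    return char, math.log(prob)


if __name__ == "__main__":
    dummy_stroke: Stroke = [(0.0, 0.0), (0.0, 1.0)]
    print(best_segmentation([dummy_stroke] * 4, toy_recognize))
```

In a dictionary-driven system the same search would additionally be constrained to paths that spell valid words; that step is omitted here.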


