The shape of handwritten characters

https://doi.org/10.1016/S0167-8655(03)00017-5

Abstract

A handwritten character magically survives serious distortions in size, orientation and even structure, justifying perhaps its Sanskrit name, Aksharam, the undecaying. Several traditional approaches model characters in terms of certain shape features like line crossings, T-junctions etc. But there is no sanctity in the choice of these features, which may be specific to a script, nor is there a limit to their number. We address the general problem of defining the shape of a 2D line diagram, with characters as a significant special case. To this end we develop a framework based on a branch of mathematics known as Catastrophe theory. A small set of 11 shape features is derived systematically from our framework. The 11 features are found in several of the world’s scripts and may in fact be universal. More complex shapes break down into these 11 in handwritten scripts. We discuss how our model can be applied to on-line character recognition from pen-based devices.

Introduction

Template-based approaches might be feasible for optical character recognition (OCR), but may not be adequate for handwritten character recognition (HCR). Handwritten characters survive drastic distortions introduced by handwriting, and yet enable recognition by human readers. What is it about a character that survives significant structural injury, facilitating recognition? A vague answer to this difficult question is: shape. It is not easy to give a precise, unique, mathematical definition of shape. Shape is qualitative, not quantitative. It cannot be expressed in mensurational quantities like precise lengths, angles etc. It is not even a question of topology, since a number of uppercase English letters (e.g. C, G, L, M, N, S, U, V, W, Z) are topologically equivalent and cannot be distinguished by topological considerations alone.

How then may one characterize shape? Character recognition research often addresses this question of shape by extracting certain shape features from characters. These features include corners (Mehrotra et al., 1990), line-crossings (‘X’), ‘T’ junctions, high curvature points and other shape primitives (Fischler and Wolf, 1994; Li and Yeung, 1997; Sanfeliu and Fu, 1983; Teh and Chin, 1989). But often the choice of these features is ad hoc and script dependent. And one is never certain if a given set of features is minimal or redundant. The search for points that mark significant shape-related events on a planar curve led to the invention of a variety of special points like anchorage points (Anquetil and Lorette, 1997), salient points (Fischler and Wolf, 1994), dominant points (Teh and Chin, 1989), singular points (Rocha and Pavlidis, 1994) etc. In (Fischler and Wolf, 1994) an algorithm for locating perceptually salient points on curves is given; the authors claim that the algorithm identifies points that are found to be salient by human observers also. Scale-space arguments have also been used to locate perceptually significant points (Witkin, 1983; Kadirkamanathan and Rayner, 1990). Outside character recognition research, such special points have been used for curve partitioning, 2D and 3D object representation and recognition, for scene segmentation and compression, object motion tracking etc.

A host of distinguished points have been proposed in the context of character recognition also. In (Rocha and Pavlidis, 1994), characters are described as graphs where the edges are straight lines or convex arcs. Some of the nodes are designated as singular points and character identification is done using flexible graph-matching techniques. A procedure for elastic matching of curves is described in (Burr, 1980). Pavlidis et al. describe a method for curve matching using physics-based transformations (Pavlidis et al., 1998). Curvature maxima are used to segment cursive script in (Kadirkamanathan and Rayner, 1990). Peaks and valleys of x and y profiles are used to capture signature shape in (Gupta and Joyce, 1997). The sequence of occurrence of these points along the signature generates a symbol string, which may be used for identification using string matching techniques. But this approach is restrictive since peaks and valleys (i.e. simple maxima and minima) are not the only features that determine the shape of a smooth function; there is an infinite hierarchy of higher-order features starting with inflexion points (Gilmore, 1981).
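The peak-and-valley encoding mentioned above can be illustrated with a minimal sketch. It is not the procedure of Gupta and Joyce: the function names, the four-letter symbol alphabet (X and x for a peak and a valley of the x profile, Y and y for the y profile) and the strict three-point extremum test are all assumptions made here for illustration.

```python
from typing import List, Sequence, Tuple


def profile_extrema(values: Sequence[float], peak: str, valley: str) -> List[Tuple[int, str]]:
    """Return (sample index, symbol) for each strict interior local extremum.

    Plateaus (runs of equal samples) are ignored in this simplified test.
    """
    events = []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if cur > prev and cur > nxt:
            events.append((i, peak))
        elif cur < prev and cur < nxt:
            events.append((i, valley))
    return events


def extrema_string(xs: Sequence[float], ys: Sequence[float]) -> str:
    """Encode a sampled stroke as the sequence of its x and y peaks and valleys.

    The alphabet is hypothetical: 'X'/'x' mark a peak/valley of the x profile,
    'Y'/'y' a peak/valley of the y profile, ordered by sample index.
    """
    events = profile_extrema(xs, "X", "x") + profile_extrema(ys, "Y", "y")
    events.sort(key=lambda event: event[0])  # order of occurrence along the stroke
    return "".join(symbol for _, symbol in events)


# A zigzag stroke: x increases steadily while y goes up, down, up, down.
zigzag = extrema_string([0, 1, 2, 3, 4], [0, 2, 0, 2, 0])

# A stroke whose y profile follows t**3: it has an inflexion but no extrema.
cubic = extrema_string([0, 1, 2, 3, 4], [-8, -1, 0, 1, 8])
```

Under these assumptions the zigzag stroke is encoded as "YyY", while the cubic stroke yields an empty string: its inflexion point leaves no trace in the encoding, which is the limitation noted above.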

Therefore we find that, in a majority of cases outside the character recognition domain, the salient or special points defined to segment planar curves are based on high curvature or some intuitively related property. In the context of character recognition, the salient points proposed are often ad hoc and are deduced from a specific script by heuristic considerations. It would be interesting to see if it is possible to present an exhaustive list of salient points, which can be used to represent any script of the world. For it is our belief that though there may be an infinite variety of characters in various scripts, it might be possible to reduce them to some sort of shape components which are universal. However, for such a program to succeed one must obtain a fundamental insight into the shape, not just of characters, but of line diagrams in general, of which characters are significant instances.

A systematic treatment of the problem of shape can be found in a branch of mathematics known as Catastrophe theory (CT), a branch of Singularity theory. CT aims to formally explain the origin of forms or shapes in Nature (Thom, 1975; Gilmore, 1981; Lu, 1976; Poston and Stewart, 1978). In this theory, the problem of describing forms in Nature reduces, in mathematical terms, to the problem of describing the shape of smooth functions. The profound results of CT have been used to explain a range of forms in Nature, such as the breaking of a swelling wave, the streaks of light seen in a coffee cup (known as “light caustics”), the splash of a drop on a liquid surface etc. (Thom, 1975; Gilmore, 1981). The theory has been applied to a variety of problems in engineering and physics (Gilmore, 1981; Poston and Stewart, 1978). CT has been used to classify the local shape of 3D volumes (Koenderink and van Doorn, 1986). CT has been used to distinguish phantom edges from real ones in a scale-based approach to edge extraction (Clark, 1988). CT has also been applied in a new style of function approximation in which the goal of approximation is to model only the shape of the target function (Chakravarthy and Ghosh, 1997).

In this paper we present a mathematical framework for defining the shape of a character. We argue that the global shape of a character is determined by a set of local shapes. The local shapes, which are few in number, combine variously to give rise to a great diversity of characters. We show that the local shapes can be systematically derived from our framework. The paper is organized as follows. Section 2 briefly introduces some of the basic ideas of CT, which is the starting point for our theory. Section 3 introduces our framework; a small set of local shapes is also defined in the same section. The theory is used to represent handwritten characters in Section 4. The work is summarized in the final section.

Section snippets

The shape of smooth functions––Catastrophe theory

With a view to explaining the origin of forms in nature, René Thom invented CT, a branch of singularity theory (Thom, 1975; Gilmore, 1981; Lu, 1976; Poston and Stewart, 1978). According to CT the overall shape of a smooth function, f(x), is determined by special local features like “peaks”, “valleys” etc. Mathematically, these features are characterized by the points where the first, and possibly some higher, derivatives vanish. Such points are known as the critical points (CP). There exists a

The shape of handwritten characters

A handwritten character is formed gradually by a sequence of hand strokes. A stroke is defined as the movement of pen between the moment when contact is made with the paper, and the moment when the contact is broken. Therefore, each stroke of the pen traces a finite, continuous line segment on the paper, which may be described as: $X = X(t),\ Y = Y(t),\ t \in [t_0, t_1]$, where X(t) is the x-(horizontal) coordinate of the pen, and Y(t) is the y-(vertical) coordinate of the pen. The parameter, t, takes the natural

Discussion

This work is primarily intended as a departure from the practice of proposing ad hoc, script-dependent shape descriptors. It moves towards this goal by presenting a systematic framework for describing the shape of line diagrams in general, with characters coming in as special cases. We found such a framework in CT, which classifies the local shapes of parameterized smooth functions. Adapting CT, we have presented a framework in which character shape may be formally described. As in CT, global

Conclusions

A general theory of the shape of line diagrams is presented. The theory is used to represent handwritten characters. The SPs represent the smallest units (“atoms of shape”) which may be viewed as building blocks of a larger shape, say, as that of a handwritten character. The theory is not developed with any specific script in mind; the shape points are derived from a general mathematical framework. The theory explains several of the distortion patterns of handwritten characters. The notion of

References (21)

  • S.V. Chakravarthy et al. Function emulation using the radial basis function network. Neural Networks (1997)
  • R. Mehrotra et al. Corner detection. Pattern Recognition (1990)
  • I. Pavlidis et al. An on-line handwritten note recognition method using physics-based shape metamorphosis. Pattern Recognition (1998)
  • Anquetil, E., Lorette, G., 1997. Perceptual model of handwriting drawing: Application to the handwriting segmentation...
  • Burr, D., 1980. Elastic matching of line drawings. In: Proc. 5th Internat. Conf. on Pattern Recognition, Miami Beach,...
  • J.J. Clark. Singularity theory and phantom edges in scale-space. IEEE Trans. Pattern Anal. Machine Intell. (1988)
  • M.A. Fischler et al. Locating perceptually salient points on planar curves. IEEE Trans. Pattern Anal. Machine Intell. (1994)
  • R. Gilmore. Catastrophe Theory for Scientists and Engineers (1981)
  • Gupta, G.K., Joyce, R.C., 1997. A Study of Shape in Dynamic Handwritten Signature Verification. Technical Report,...
  • M. Kadirkamanathan et al. A scale-space filtering approach to stroke segmentation of cursive script

There are more references available in the full text version of this article.
