1 Introduction

When taking notes on a lecture or an idea, people often use a pen and a sheet of paper because they can write immediately and freely decide the position and shape of each handwritten object. However, editing handwritten objects requires erasing them and rewriting them appropriately, which is troublesome. One solution is to recognize objects handwritten on a pen-based computer by pattern recognition techniques and treat them as coded characters [1, 2]. With this approach, however, the user has to correct misrecognized objects manually, and the system cannot deal with objects outside the scope of its pre-defined recognition targets. Chen [3] therefore developed application software in which a user edits handwritten objects with pen-based gestures and the objects are automatically aligned while the shape of each object is maintained, without a recognition process. However, the software has the following drawbacks:

  • A user must perform many operations to edit an object. For example, object deletion and object transfer require 6 and 8 operations, respectively.

  • The accuracy of the character segmentation process is not high enough, which causes errors in the editing processes.

  • Figures cannot be inserted freely. For example, text and a figure cannot coexist in one line of text.

We therefore propose a new system that solves these problems.

2 System Overview

Our system has three modes: character writing, figure drawing, and editing. A mode is selected by tapping the appropriate icon in the sub window (see Fig. 1(a)). In the character writing mode, a user can write any characters along the ruled lines in the main window. We call the objects written in this mode character objects. In the figure drawing mode, any objects can be drawn without limitation. We call the objects drawn in this mode figure objects. In both modes, a handwritten object is represented by one or more strokes. Each stroke is a connected component from pen-down to pen-up, represented as a sequence of 2D sampling points obtained from the pen computer device.

To distinguish character objects from figure objects, we prepare a table as shown in Fig. 1(b). The main window is divided into equal-size blocks, each of which holds a drawing status. We call this 2D array of blocks the ID table. For character objects, the blocks overlapped by the bounding box of each stroke are set to −1 in the ID table. For figure objects, the blocks overlapped by the bounding box of all strokes handwritten from the beginning to the end of one figure drawing mode session are set to a certain positive integer. All other blocks have the value 0 (that is, every block has the value 0 in the initial state). Throughout these processes, the system does not permit regions with different non-zero values to overlap; only blocks with the value 0 can be newly assigned.

In the editing mode, a user can edit handwritten objects with pre-defined pen gestures, and the character objects around the edited objects are automatically aligned. When object positions change, the contents of the ID table are updated accordingly.

Fig. 1. System overview and internal representation for the main window
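The ID table described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the class name `IDTable`, its methods, and the block size are hypothetical, and the rule "reject a mark that would touch a block holding a different non-zero value" is one reading of the overlap restriction in the text.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in window pixels

CHARACTER_ID = -1  # value the text assigns to character objects
EMPTY_ID = 0       # value the text assigns to unused blocks


class IDTable:
    """Hypothetical block grid over the main window (names are assumptions)."""

    def __init__(self, width: int, height: int, block: int) -> None:
        self.block = block
        self.cols = -(-width // block)   # ceiling division
        self.rows = -(-height // block)
        self.cells: List[List[int]] = [
            [EMPTY_ID] * self.cols for _ in range(self.rows)
        ]

    def _covered(self, box: Box) -> List[Tuple[int, int]]:
        """Return (row, col) of every block overlapped by a bounding box."""
        left, top, right, bottom = box
        c0 = max(int(left // self.block), 0)
        c1 = min(int(right // self.block), self.cols - 1)
        r0 = max(int(top // self.block), 0)
        r1 = min(int(bottom // self.block), self.rows - 1)
        return [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]

    def mark(self, box: Box, value: int) -> bool:
        """Assign value to the covered blocks unless another object owns one."""
        covered = self._covered(box)
        if any(self.cells[r][c] not in (EMPTY_ID, value) for r, c in covered):
            return False  # would overlap a block with a different non-zero value
        for r, c in covered:
            self.cells[r][c] = value
        return True

    def id_at(self, x: float, y: float) -> int:
        """Look up the value of the block containing point (x, y)."""
        return self.cells[int(y // self.block)][int(x // self.block)]
```

For example, marking a character stroke with `CHARACTER_ID` and then trying to mark an overlapping figure region with a positive ID returns `False`, which is where a real system would refuse or relocate the drawing.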

3 Segmentation of Character Objects

To apply the editing and alignment processes, the strokes in character objects must be grouped into character symbols in advance. This segmentation process is applied when the mode is changed from the character writing mode to another mode, or when character objects are inserted through the text input popup dialog box. We assume that characters handwritten along a ruled line are almost the same size, and propose the following segmentation method based on this assumption:

  1. If the bounding box of a handwritten stroke overlaps the bounding box of another stroke, these strokes are merged since they may belong to the same character symbol. We call the resulting set of merged strokes character candidate set A.

  2. For each candidate in set A, we obtain the width and height of its bounding box. The values obtained for all candidates in set A are sorted in descending order, and the average of the upper half is taken as the estimated character size \( \upalpha \).

  3. The candidates in set A are ordered by the horizontal position of the left side of their bounding boxes. Let \( \{ a_{1} ,a_{2} , \cdots ,a_{n} \} \) be the ordered candidates. Since neighboring candidates may have to be concatenated into one character, we enumerate all combinations of consecutive candidates. That is, the combination list \( \{ a_{1} \}, \{ a_{1} ,a_{2} \}, \{ a_{1} ,a_{2} ,a_{3} \}, \cdots ,\{ a_{1} ,a_{2} , \cdots ,a_{n} \}, \{ a_{2} \}, \{ a_{2} ,a_{3} \}, \cdots ,\{ a_{2} ,a_{3} , \cdots ,a_{n} \}, \{ a_{3} \}, \cdots ,\{ a_{n} \} \) is obtained. The combination whose bounding box width is closest to the estimated size \( \upalpha \) is selected, and the candidates that belong to it are merged. All combinations that include the merged candidates are then deleted from the list. This process is repeated until the combination list is empty. We call the resulting set of merged candidates character candidate set B.

  4. For each candidate in set B, we obtain the width of its bounding box. The values obtained for all candidates in set B are sorted in descending order, and the average of the upper half is taken as the estimated character size \( \upbeta \).

  5. Small candidates may remain after stage 3. Therefore, each candidate in set B whose width is smaller than \( \lfloor \upbeta / 2 \rfloor \) (that is, \( \upbeta / 2 \) rounded down to an integer) is concatenated with either its left-hand or its right-hand neighbor. The left-hand neighbor is adopted if the width of the bounding box produced by concatenating with it is closer to \( \upbeta \) than the width produced by concatenating with the right-hand neighbor; otherwise, the right-hand neighbor is adopted. We call the resulting set of merged candidates character candidate set C.

  6. For each candidate in set C, if its bounding box overlaps the bounding boxes of other candidates, they are merged. The final set of merged candidates is called the character candidates. Each character candidate consists of one or more strokes.
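Stages 1 to 3 of the procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it tracks bounding boxes only (a real system would carry the strokes along), and the names `merge_overlapping`, `estimate_size`, and `merge_by_width` are hypothetical. Pooling widths and heights in stage 2, rounding the "upper half" down for odd counts, and breaking ties by list order are assumptions that the text does not specify.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned bounding boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a: Box, b: Box) -> Box:
    """Smallest box that encloses both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_overlapping(boxes: List[Box]) -> List[Box]:
    """Stage 1 (and stage 6): merge boxes until no two of them overlap."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlaps(merged[i], merged[j]):
                    merged[i] = union(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan because the list was modified
    return merged


def estimate_size(values: List[float]) -> float:
    """Stages 2 and 4: average of the upper half of the sorted values."""
    ordered = sorted(values, reverse=True)
    upper = ordered[: max(1, len(ordered) // 2)]  # assumption: round down
    return sum(upper) / len(upper)


def merge_by_width(boxes: List[Box], alpha: float) -> List[Box]:
    """Stage 3: repeatedly merge the run of neighbors whose width is closest to alpha."""
    ordered = sorted(boxes, key=lambda b: b[0])
    n = len(ordered)
    # Every run of consecutive candidates a_i..a_j, stored as index pairs.
    runs = [(i, j) for i in range(n) for j in range(i, n)]

    def width(run: Tuple[int, int]) -> float:
        i, j = run
        return max(b[2] for b in ordered[i : j + 1]) - ordered[i][0]

    result: List[Box] = []
    while runs:
        i, j = min(runs, key=lambda r: abs(width(r) - alpha))
        box = ordered[i]
        for other in ordered[i + 1 : j + 1]:
            box = union(box, other)
        result.append(box)
        # Drop every run that shares a candidate with the merged run.
        runs = [(p, q) for (p, q) in runs if q < i or p > j]
    return sorted(result, key=lambda b: b[0])
```

A typical call chain would be `set_a = merge_overlapping(stroke_boxes)`, then `alpha = estimate_size(widths_and_heights_of_set_a)`, then `set_b = merge_by_width(set_a, alpha)`. Enumerating all runs costs O(n²) entries per line of text, which is acceptable for the short lines a sketch like this targets.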

4 Editing Functions

In the editing mode, a user can edit handwritten objects with pre-defined pen gestures. Four types of pen gestures are provided, and each of them consists of two strokes, as shown in Fig. 2. In the figure, the numbers and the arrows denote stroke order and direction, respectively. The gesture type is recognized from the positions of the start and end points of each stroke.

Fig. 2. Pre-defined pen gestures

4.1 Deletion Process

When two horizontal lines are drawn over a target object (see Fig. 2(a)), the strokes that overlap the bounding box of the lines are deleted. When character objects are deleted, the deleted region is filled with the neighboring handwritten character objects if necessary (see Fig. 3). This character alignment procedure is based on the precomputed character candidates, and it propagates along the ruled lines until the end of the character objects or a line feed mark is detected.

Fig. 3. An example of deletion process

When a cross gesture is drawn over a figure object (see Fig. 2(d)), the ID number at the position of the intersection point of the two gesture strokes is read from the ID table, and the strokes with this ID number are removed. The character alignment procedure is applied in this case as well, if necessary.
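The two deletion rules can be sketched in Python as follows. This is a hedged illustration only: the `Stroke` class, the function names, and the plain list-of-lists ID table are hypothetical, each cross stroke is approximated by the straight segment from its first to its last sampling point, and the character alignment step is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


@dataclass
class Stroke:
    points: List[Point]
    object_id: int = -1  # -1: character stroke, positive: figure ID (assumed layout)


def bbox(points: List[Point]) -> Box:
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def overlaps(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def delete_by_lines(strokes: List[Stroke], gesture: List[Stroke]) -> List[Stroke]:
    """Two-line gesture: drop strokes overlapping the gesture's bounding box."""
    area = bbox([p for g in gesture for p in g.points])
    return [s for s in strokes if not overlaps(bbox(s.points), area)]


def crossing_point(a: Stroke, b: Stroke) -> Optional[Point]:
    """Intersection of the segments joining each stroke's first and last point."""
    (x1, y1), (x2, y2) = a.points[0], a.points[-1]
    (x3, y3), (x4, y4) = b.points[0], b.points[-1]
    d = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if d == 0:
        return None  # parallel segments never cross
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / d
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / d
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def delete_figure_at(
    strokes: List[Stroke], id_table: List[List[int]], block: int, cross: List[Stroke]
) -> List[Stroke]:
    """Cross gesture: read the ID under the crossing point, drop that figure."""
    point = crossing_point(cross[0], cross[1])
    if point is None:
        return list(strokes)
    # Assumes the crossing point lies inside the window covered by id_table.
    figure_id = id_table[int(point[1] // block)][int(point[0] // block)]
    if figure_id <= 0:
        return list(strokes)  # no figure object under the cross
    return [s for s in strokes if s.object_id != figure_id]
```

Looking the ID up in the table rather than testing every stroke against the cross is what lets one gesture remove all strokes of a figure, including those the cross does not touch.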

4.2 Insertion Process

When two lines as shown in Fig. 2(b) are drawn over the target character objects, a dialog box appears (see Fig. 4). The user handwrites characters in the box, and they are inserted at the insertion position. Among the division positions between character candidates, the insertion position is the one closest to the center of the starting points of the two lines of the insertion gesture. The character alignment procedure is also applied if necessary.

Fig. 4. An example of insertion process
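The choice of the insertion position can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: the function name `insertion_index` is hypothetical, all candidates are assumed to lie on one ruled line, and the division positions are taken to be the left edge of the first candidate, the midpoints of the gaps between neighbors, and the right edge of the last candidate.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def insertion_index(candidates: List[Box], start_a: Point, start_b: Point) -> int:
    """Return k such that new characters go before the k-th candidate.

    start_a and start_b are the starting points of the two gesture strokes.
    """
    if not candidates:
        return 0
    ordered = sorted(candidates, key=lambda b: b[0])
    center_x = (start_a[0] + start_b[0]) / 2

    # Division positions: before the first, between neighbors, after the last.
    cuts = [ordered[0][0]]
    cuts += [(left[2] + right[0]) / 2 for left, right in zip(ordered, ordered[1:])]
    cuts.append(ordered[-1][2])

    return min(range(len(cuts)), key=lambda k: abs(cuts[k] - center_x))
```

With three candidates spanning x = 0 to 10, 14 to 24, and 28 to 38, a gesture whose strokes start at x = 11 and x = 15 has its center at x = 13, so the closest division position is the gap at x = 12 and the new characters are inserted before the second candidate.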

When the insertion gesture (see Fig. 2(b)) is drawn over a figure object, a user can add any strokes to the original figure object. The system gives the added strokes the same ID number as the original figure object.

4.3 Transfer Process

When a horizontal line is drawn over the target objects, the system displays a circle on the line as shown in Fig. 2(c). If the user then draws a line from the circle to a destination, the target objects are moved to the destination (see Fig. 5). The transfer process is implemented with the deletion and insertion processes.

Fig. 5. An example of transfer process

If the first stroke of the gesture is drawn over a figure object, the deletion target is all strokes with the ID assigned to that figure object. If the figure object is moved onto character objects, the character objects make way for the transferred object and the character alignment procedure is applied to them. If, on the other hand, the first stroke is drawn over character objects, the deletion target is selected in units of character candidates. The target character candidates are moved to the destination by the method of Sect. 4.2.

4.4 Other Functions

When a cross gesture as shown in Fig. 2(d) is drawn over a character candidate, a line feed mark is inserted before the candidate, and the candidate is moved to the head of the next line if necessary. The system displays a red point on the left side of the candidate. The same gesture is used to remove the line feed mark.

As shown in Fig. 1(a), the sub window has undo and redo buttons, with which a user can undo and redo up to the 30 most recent actions.
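A bounded undo and redo history of this kind can be sketched in Python as follows. The text states only the limit of 30 actions; storing whole document snapshots, the class name `History`, and its methods are assumptions made for illustration.

```python
from collections import deque
from typing import Any, Deque, List


class History:
    """Hypothetical snapshot-based history with a fixed undo depth."""

    def __init__(self, initial: Any, limit: int = 30) -> None:
        self.current = initial
        self._undo: Deque[Any] = deque(maxlen=limit)  # oldest entries fall off
        self._redo: List[Any] = []

    def record(self, new_state: Any) -> None:
        """Store the state produced by a new editing action."""
        self._undo.append(self.current)
        self._redo.clear()  # a new action invalidates the redo chain
        self.current = new_state

    def undo(self) -> bool:
        if not self._undo:
            return False
        self._redo.append(self.current)
        self.current = self._undo.pop()
        return True

    def redo(self) -> bool:
        if not self._redo:
            return False
        self._undo.append(self.current)
        self.current = self._redo.pop()
        return True
```

Using `deque(maxlen=limit)` makes the 30-action cap automatic: once the limit is reached, recording another action silently discards the oldest snapshot.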

5 Experimental Results

To evaluate the accuracy of the character segmentation and the usability of the proposed editing functions, 10 test users used the system.

First, we describe the evaluation of the segmentation accuracy for character objects. Each test user wrote the 29 characters shown in Fig. 6(a) three times. The characters consist of Japanese Hiragana, Katakana, and Kanji characters and alphabet letters (block letters). The average segmentation accuracy was 76%. Across the test users, the lowest accuracy was 40% and the highest was 95%. High accuracy was obtained when the written characters had almost the same size and the gaps between characters were large enough. In the opposite situation, the accuracy tends to drop since it is difficult to estimate a single size for all characters.

Fig. 6. An example of segmentation of character objects

Next, to evaluate the usability of the editing functions, the test users responded to a questionnaire consisting of the following three evaluation items, each rated on a 5-point scale (5 is high and 1 is low):

  1. Are the pen gestures easy to memorize?

  2. Are the pen gestures easy to draw?

  3. Are the edit functions processed as expected?

Table 1 shows the results. For the first evaluation item, the average value for the cross gesture was 3.7, which is relatively low. A likely reason is that most test users normally draw the two strokes of a cross in the opposite order, which made the gesture hard to memorize. Moreover, the cross gesture has two different functions: insertion of a line feed mark for a character candidate and deletion for a figure object. This confused the test users. For the second evaluation item, the deletion gesture received a somewhat low value of 3.9. The cause is misrecognition of the gesture type. For the final evaluation item, the average value for the deletion gesture was 3.7. If the two strokes of the deletion gesture are drawn close together, strokes above or below the bounding box of the gesture strokes remain against the user's intent, which explains the relatively low value. However, the other evaluation values are more than 4.0, so we consider the proposed editing functions practical enough. Furthermore, every pen gesture can be drawn with only two strokes, which is much simpler than the editing method of the existing software [3].

Table 1. Questionnaire result for three evaluation items

6 Conclusion

We have proposed a new pen gesture-based editing system for online handwritten objects. To improve the segmentation of character objects, we proposed a method in which handwritten strokes are merged based on an estimated character size. In an experiment on 870 characters written by 10 test users, the average accuracy was 76%. The results indicate that the accuracy is higher when the handwritten characters have almost the same size and the gaps between neighboring characters are large enough.

Next, to simplify the existing editing operations, we proposed editing methods that use four types of pen gestures, each of which consists of only two strokes. In the usability evaluation, most test users gave high marks to all the editing functions.

We also proposed an area management method that uses the ID table for the drawing window. As a result, a user can freely transfer a figure object into any text line while overlap between character objects and figure objects is prevented.

We therefore conclude that the system is practical enough for editing online handwritten objects.

To realize a more user-friendly system, the performance of character segmentation must be improved and the stroke order of the gestures must be made free. The current system cannot handle cursive handwriting properly, and we plan to address this problem in future work.