Electrical & Electronic

In this work a new method of signature recognition with neural network is proposed. The features are extracted from two raw databases of ATVS signature database: one consisting of 25 signature samples of 350 persons and other 46 signatures of 25 persons. The features include 9 features computed by DS Guru and HN Prakash and proposed features are no of pen downs, magnitude of average velocity, magnitude of average acceleration and length to height ratio. Signature features are pre-processed with a scaling method and brought to a value having same decimal point. A feed forward neural network is trained using back propagation learning method. With each features removed, an accuracy rate is calculated to check the feature which will be better for signature verification. Accuracy of recognition up to 98% and 89% are obtained using signature samples of 10 persons from each database respectively.


Introduction
Authentication of user is becoming very important to do business transactions, accessing data and for security purpose. Many different techniques are applying for authentication purpose. User IDs and passwords, PIN codes, ATM card, PAN card are many different ways which are common today, but the problem of such systems are that, they need remembering different PINs or passwords, carry such items and they need to be kept secret from others. Signature is a behavioral biometric. Automatic signature authentication is now becoming popular in research areas because of its acceptance in legal and social areas and its widespread use for authentication purpose.
Since signature for different individuals vary with the variation of individuals, so it is a very robust biometric to authenticate a user. Signature verification is a very difficult pattern recognition problem. Since intra class variations occur, even experts get difficulty to recognize the forgery signature. And also it is not very difficult to forge a signature. Signature is believed to be a reflex action which produces its dynamic properties unconsciously.
Biometrics can be broadly divided into physiological and behavioral biometrics. Some examples of physiological biometrics are fingerprint, iris and face; and behavioral biometrics is signature, voice and hand writing. Authentication of signature is done by detecting forgeries. Forgeries can be divided into random forgery, simple simulated forgery and simulated skilled forgery. Figure 1 shows an example of forgery signatures. Random forgeries are produced randomly without any information about the name and person for whom signature is produced. Random forgery are generated when forger do not have any available access to the signature. They may have different shape and size from the authentic signature. Simple simulated forgery may have same semantic meaning like authentic one but overall shape and size may differ. Skilled simulated forgery signatures are the signatures which are given by a large number of practices.
For online signature verification, signature data are generally taken using capacitive tablet or personal data assistance (PDA) which gives the x-y coordinate, pressure readings etc. From these raw data different features can be computed. Signature authentication problem can be solved by two ways: dynamic and static, which needs computation of dynamic and static features respectively. Dynamic measurement of signature features can be obtained by electronic tablet or PDA; static features can be computed from images obtained either by camera or scanning the photo of the signatures. Dynamic features [1] are functions of time and static features are time independent. Even if a skilled forger produces the same looking signature like the authentic one, they cannot easily learn to produce the same pressure produced by the authentic one. Hence dynamic features help to detect forgery. The use of pen dynamics over shape of signature would be more useful in forgery detection because dynamic features of a signature are not readily available to forger as in the shape of offline signature.
Handwritten signatures can be represented by multiple modals i.e. global and local, shape based and time based. Local shape based signature and their advantages are discussed in Automatic On-Line Signature Verification by Nalwa [2]. Three signature databases are collected and analyzed for a different period of time. Different reasons for inaccuracies are also discussed. Global feature based technique is applied in Signature Verification method [3]. Three global features i.e. projection moment, upper envelope based characteristics and lower envelope based characteristic are used and then multiple neural network is applied for the classification purpose. To remove noise they have applied median filter. A fusion of local and regional features is discussed in Fusion of Local and Regional Approaches for On-Line Signature Verification by Aguilar [4]. The local function based features are classified with dynamic time warping (DTW) and regional features are classified with hidden markov model (HMM). In Pattern recognition [5], signature verification based on logarithmic spectrum is done. Principle components of the logarithmic spectrum are compared with the reference signature and similarity value is calculated between the enrolled and reference signature. A stroke based method for shape and dynamics of signature is discussed in Multiexpert Verification of Hand-Written Signature by Bovino [6]. Two level strategies using soft and hard rule are implemented in it.

Abstract
In this work a new method of signature recognition with neural network is proposed. The features are extracted from two raw databases of ATVS signature database: one consisting of 25 signature samples of 350 persons and other 46 signatures of 25 persons. The features include 9 features computed by DS Guru and HN Prakash and proposed features are no of pen downs, magnitude of average velocity, magnitude of average acceleration and length to height ratio. Signature features are pre-processed with a scaling method and brought to a value having same decimal point. A feed forward neural network is trained using back propagation learning method. With each features removed, an accuracy rate is calculated to check the feature which will be better for signature verification. Accuracy of recognition up to 98% and 89% are obtained using signature samples of 10 persons from each database respectively. The signature authentication is done in two types of problems: signature verification and signature recognition. In verification, the features of the test signature are compared with a few number of stored features of the signature of the claimed person, and need to verify if the signature belongs to that particular person or not. But in recognition the features of the test signature are compared with a few no of stored features of a no of persons and we have to recognize whether the test signature belongs to the one of the enrolled persons and identify the person.
Online signature verification can be broadly classified into two groups based on their feature extraction method: parametric approach and function based approach. In parametric approach a set of parameters (e.g. Speed, displacement, position, pen up pen down, wavelet transform etc.) extracted can be used as a feature to form a signature pattern, and those feature patterns can be used as reference and test signature to examine the authentication of the signature. In function based approach the features are the function of time (e.g. velocity, acceleration, pressure, direction of pen movement etc.). Online signatures are characterized as a time function.
In any verification task there are two types of error involved i.e. false rejection and false acceptance. False rejection occurs when the authentic signature is rejected and false acceptance occurs when a forged signature is accepted to be authentic. When percentage of false rejection rate (FRR) is equal to the percentage of false acceptance rate (FAR) we call it equal error rate. Equal error rate is the measure of the performance of a biometric system. Average error rate is the average of FAR and FRR. The authenticity of test signature is evaluated by matching it with that of reference signature. There are many techniques available for matching e.g. dynamic time warping (DTW) [7], Hidden Markov model (HMM), support vector machine (SVM) and neural network (NN). When functions are considered, the matching technique must take into account the variation of duration of signature. A method of similarity measure for signature verification and recognition using symbolic representation is done [8]. Here the following are discussed: 1. Dynamic time warping technique is generally used for function based parameter. But the time complexity of DTW is more of the order of (O 2 ).
2. HMM performs stochastic matching using probability distribution of the features. They can compute both similarity and variability of the pattern. But they require a large dataset to train and are complex. 3. Support vector machine classifies one class of data from the other by finding the hyper plane that maximizes the separation between classes. SVM have algorithmic complexity, and requires large storage for large scale task 4. Neural network have ability of generalization. They can be used to detect nonlinear equations for dependent and independent variables. They can train large amount of database. Easily implemented in parallel architecture.
In [9] a method of string matching or dynamic time warping is done. Here local features and stroke based global features are extracted and the results are compared, by varying the values of absolute and relative speed of the signature. Then signature feature vectors are formed in symbolic interval value and test data are inserted to check if it lies within that interval or not. After that interval value is set by calculating mean, variance and standard deviation. Finally writer dependent and feature dependent threshold are set in the database.
Many applications of artificial neural network are there. A work using autoregressive artificial neural network is done [10]. Here a monthly inflow forecasting is done using static and dynamic autoregressive artificial neural network and both radial and logsigmoid activation function is carried in this work. A 47 years of discharge is used, with 42 years of discharge is used for training and remaining 5 years of discharge is used for forecasting. In the neural network architecture, weights are initialized by Nguyen and Widrow's weight initialization method and it is giving a good result with a hidden layer having 17 neurons. From this work it is found that, the dynamic autoregressive artificial neural network is better than the static autoregressive artificial neural network. When results are compared with radial and log-sigmoid function, log-sigmoid function has given a superior result. In the successive work [11], the monthly inflow of Taleh Zang station is carried on. A comparison using both static and dynamic model of autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA) and autoregressive artificial neural network is done with the same database. And the results are compared by comparing the root mean square error (RMSE) and mean square error (MSE). From this work, it was found that, the ARIMA model is better than the ARMA model and dynamic autoregressive model is better than the static autoregressive model. From the result it can be concluded that, the ARIMA model can be used for forecasting of inflow for next 12 months and the autoregressive artificial neural network model can be used to forecast the inflow for next 5 years. In their next work [12] a comparison of performance ability of autoregressive static and dynamic artificial neural network is done for monthly inflow forecasting with the same network parameters. From this, it is found that, neural network architecture is very advantageous for data analysis of large database.

Feature Extraction
From ATVS signature sub corpus, two sets of database are collected. First database contains 25 genuine signature samples for each user. In the second database each users have 46 samples each. The databases contains raw data values of the signature i.e. x-coordinate, y-coordinate, time stamp, pen up pen down and pressure signal. From these data the features are extracted. Initially from these raw data 33 features were computed. For training in neural network feature selection is done manually where some features, which give better results were kept. Features are introduced as:

Total duration of signature
It is the time taken to complete a signature. It can be calculated as the difference between the last time stamp and the first time stamp.

Number of pen ups
The number of times pen is removed from the pad/paper.

Sign changes of dx/dt and dy/dt
dx/dt and dy/dt may be positive or negative value. So when it changes the value from positive to negative or negative to positive it is counted.

Average jerk
Jerk is change in acceleration with respect to time. Average jerk is the mean of the jerk.

Standard deviation of velocity in y-direction:
Standard deviation of v y ; Where, velocity in y-direction: v y = dy/dt (1)

Standard deviation of acceleration in y-direction
Standard deviation of a y Where, acceleration in y-direction: a y = dv y /dt (2)

Number of local maxima in x direction
Local maxima can be calculated from change in x with respect to time.

Standard deviation of acceleration in x-direction: standard deviation of a x ; where
Velocity in y-direction: v x = dx/dt Acceleration in y-direction: a x = dv x /dt (3)

Standard deviation of velocity in x-direction
Standard deviation of v x ; where Velocity in y-direction: v x = dx/dt (4)

Length to width ratio
It is the ratio of length of the signature to the width of the signature i.e. ratio of number of sample point covered by the signature in x-coordinate to number of sample point covered by the signature in y-coordinate.

Number of pen downs
Number of times pen touches the pad/paper to complete a signature.

Average magnitude of velocity
Velocity at every sample points changes. Average velocity is the mean of the velocities at every sample point.

Average magnitude of acceleration
Acceleration also changes at every sample points. Therefore average acceleration is the mean of the acceleration at every sample points.

Methodology
The main motive of this work is to find some features suitable for signature verification and to check the performance of the neural network with those features. Here we have used neural network back-propagation algorithm for training a network. In neural network approach, the main procedure to implement is: first of all the features need to be extracted and then the network is to be trained to learn the relationship between the pattern and its class. After training, validation of the network is to be checked by few features to see if the network is giving a satisfactory result or not. After validation, the network is to be tested using features which are completely unknown for the network, to see the performance of the network. Data bases are collected from ATVS signature sub corpus [13]. The reason of using two dataset is to test the generalization capability of the network. Generally it is easy for the network to generalize more with more sample data. In second database sample data is more so it should give more accurate result. Now, first database i.e. dataset I consists of 25 signature data for each individual. Second dataset i.e. dataset II consist of 46 signature sample for each individual. From the raw data set consisting of information of x coordinate, y coordinate, time stamps, pressure and pen up and pen down, features were extracted. Then the databases are divided into three parts for training, validation and testing. From dataset I, 15 signatures were extracted for training, for validation next 5 signatures were extracted and for testing also remaining 5 signatures were extracted. From dataset II, 30 signatures were extracted for training, for validation next 8 signatures were extracted and for testing remaining 8 signatures were extracted. After division of data, they are randomized. The matching technique used is neural network approach. At first the extracted data are brought to 10 th decimal point. Then the data is normalized so that the pattern values are between 0 and 1. During training weights are updated to minimize the difference between the desired output and the actual output i.e. error. The fixed weights after training can be used for the task in pattern recognition and classification. The neural network structure used have three layers: one input layer, one hidden layer and one output layer. Two models are designed for the comparison purpose. The first model with 9 features have number of input nodes 9, hidden layer nodes 60 and the output layers have 10 nodes. The second model with 13 features have 13 input nodes, 60 hidden layer nodes and 10 output layer nodes. For the second dataset the number of hidden layer nodes taken is 80. Activation provides the measure of confidence of corresponding decision of the classifier. Commonly used activation functions are sigmoid functions. The activation function used here is log sigmoid function (logsig). It gives the activation label between 0 and 1. If activation is 1 it means confidence is high and if it is 0 means confidence is zero.

A. Vector matrix form of back propagation algorithm
1. An input pattern is presented and calculated the outputs of the network at all the internal layers.

2.
For each of the layers, the sensitivity vector is calculated according to D (s) = G(v (s) )(d q -x out (s) ) for output layer

Result
Neural network is a generalization tool. The reason why the neural network approach is chosen among the number of other classification method is that, neural network is easy to use and can solve complex problems with ease. From this work it is realized that, when variation of data is more, neural network finds it difficult to generalize. That is why normalization of database is important. When introduced preprocessing of data by converting them to 10 th decimal point or same decimal point value the generalization becomes more and classification error decreases. Confusion matrix is a table which helps the visualization of the performance of supervised machine learning. The diagonal boxes in the table from left (up) to right (down) gives the true positive classification. And other boxes show the true negative classification. Figure 2 is showing the confusion matrix of genuine signatures of first dataset. The data consist of 15 genuine signatures of 10 users. From this confusion matrix, the true positive rate found is 98% i.e. false rejection rate (FRR) is 2%. Figure 3 shows the confusion matrix of forgery signatures of first dataset. The data consist of 5 forgery signatures of 10 users. From this confusion matrix, when forgery signatures were taken, the false acceptance rate is 8%.
From the confusion plot the accuracy of the system can be calculated by: Accuracy = (100 -(FAR + FRR)/2) % (8) The accuracy rate of the system is 95 % Table 1 show the result obtained from dataset I using all the features extracted. And Table 2 shows the result obtained from dataset I and dataset II using all the features extracted and also from removal of one or more features. Table 2 gives the comparison of the accuracy of result with 9 and 13 features respectively for different number of classes (users). Table 3 it is clear that features i.e. Average jerk, Standard deviation of acceleration in y direction, Standard deviation of acceleration in x direction, Standard deviation of velocity in x direction and average acceleration are not very suitable in case of dataset I. And in case of dataset II Time duration of signature and Average acceleration are not suitable. From the final accuracy result it is found that, result of dataset with more sample number gives more accurate result, which reflects the generalization capability of neural network. From Table 3 it is observed that when features i.e. length to width ratio, number of pen downs, average magnitude of velocity and average magnitude of accelerations are added with previous 9 features, the network gives a better accuracy. It can also be observed that as the number of classes increases the accuracy decreases in both the cases. It implies that the features mentioned are not very good when database with more number of classes are taken.

Conclusion
The main objective of this work is to construct a signature recognition system using some feature values so that to get a maximum accuracy label. To do this, some features were extracted and formed pattern from them. The features acts as an input pattern to the neural network and corresponding targets are constructed. In neural network, the patterns are trained according to the target, where weights are updated to get a minimum error. When a stopping condition is reached the iteration stops. In neural network training, many times trial and error is done to get satisfactory neural network architecture. Parameters like hidden nodes in a neural network, initial learning rate parameter, learning rate schedule are needed to be adjusted again and again. Neural network architecture is dependent on these parameters. In this work the nodes in the hidden layers for the first dataset is 60 and second.
Dataset is 80. Initial learning rate parameter is 1 and learning rate scheduled at 300. These parameters give a false acceptance rate of 8% and a false rejection rate of 2%. And accuracy rate when calculated gives an accuracy rate of 95% for the dataset I. And for the dataset II an accuracy of 89% is obtained using the same parameter. When experiment is performed taking different number of classes, it is found that with increase in the number of classes the accuracy is decreasing. It     Time duration of signature, Average acceleration ----80 is because of the fact that, the features considered in the experiment are not very suitable for database with large number of classes.
Since neural network has a generalization capability, once trained its weight need not be changed again. In testing it gives the result from the trained architecture itself. The main drawback of neural network training is that, for larger dataset it is very difficult to adjust the parameters by trial and error method. And the time consumption is more.