Real-time human identification using a pyroelectric infrared detector array and hidden Markov models

Abstract: This paper proposes a real-time human identification system using a pyroelectric infrared (PIR) detector array and hidden Markov models (HMMs). A PIR detector array with masked Fresnel lens arrays is used to generate digital sequential data that can represent a human motion feature. HMMs are trained to statistically model the motion features of individuals through an expectation-maximization (EM) learning process. Human subjects are recognized by evaluating a set of new feature data against the trained HMMs using the maximum-likelihood (ML) criterion. We have developed a prototype system to verify the proposed method. Sensor modules with different numbers of detectors and different sampling masks were tested to maximize the identification capability of the sensor system.


Introduction
A biometric system is essentially a pattern recognition system that ensures personal identification by evaluating the authenticity of a specific physiological or behavioral characteristic possessed by the subject. In conventional biometric systems, the complex structure of certain body parts (e.g., the iris, fingerprints, face, or hand geometry) is measured optically and analyzed digitally, and a digital code is created for each person. Recent advances in optical and digital technologies, biometric sensors, and matching algorithms have led to the deployment of biometric recognition systems in a variety of security applications [1].
When a human walks, the motion of various components of the body, including the torso, arms, and legs, produces a characteristic signature. Human walking motion is a complex process, and it is difficult to decouple the individual biomechanical contributions in a motion cycle for analysis. From the thermal perspective, each person acts as a distributed IR source whose distribution function is determined by the shapes and the IR emissions of the body components [2]. Combined with the idiosyncrasies in how an individual carries himself or herself, the human thermal signature will impact a surrounding sensor field in a unique way. The average human frame radiates about 100 W/m² of power, with an emission peak at 9.55 μm [3,4].
There is a constant heat exchange between a human body and the environment due to the difference in their temperatures.
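As a rough consistency check on the quoted emission peak, Wien's displacement law relates the peak wavelength of a blackbody to its temperature. The sketch below assumes that the body surface radiates approximately as an ideal blackbody, which is an idealization not stated in the text; the function names are illustrative only.

```python
# Wien's displacement constant in micrometre-kelvin (b = 2897.77 um*K).
WIEN_B_UM_K = 2897.77


def peak_wavelength_um(temperature_k):
    """Peak emission wavelength (um) of an ideal blackbody at temperature_k (K)."""
    return WIEN_B_UM_K / temperature_k


def temperature_for_peak_k(wavelength_um):
    """Blackbody temperature (K) whose emission peaks at wavelength_um (um)."""
    return WIEN_B_UM_K / wavelength_um


# Under the blackbody assumption, a peak at 9.55 um corresponds to a
# surface temperature of roughly 303 K (about 30 degrees Celsius).
surface_temp_k = temperature_for_peak_k(9.55)
print(round(surface_temp_k, 1))
```

Under this idealization, the quoted peak is consistent with a skin or clothing surface temperature somewhat below core body temperature.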
The pyroelectric infrared (PIR) detector is sensitive in the range of 8~14 μm and is able to detect humans within a reasonable distance (<15 m). PIR detectors have been used in a wide variety of applications [6][7][8]. In our previous study, we used a PIR detector whose visibility was modulated by a Fresnel lens array to capture an analog feature of human walking motion [9,10]. The spectra of the sensor response data generated by a human walking along a fixed path were used to distinguish individuals. In this paper, a PIR detector array with masked Fresnel lens arrays is utilized to generate digital sequential data that can represent a human motion feature. This digital-feature-based system is insensitive to walking velocity for angular velocities between 1.1 rad/s and 3.1 rad/s [8]. The feature model is based on the statistics of the on-off patterns of the sensor array for a walker, who may walk at different speeds during the training stage.
Although IR cameras with large numbers of pixels are also capable of advanced positioning and control, they are inevitably associated with high data loads, high computational costs, and much higher system costs. Sensor systems based on PIR detectors with coded masks, on the other hand, can achieve the desired identification capability at low data loads and low computational and system costs. These pyroelectric detectors are available in either single-element or dual-element versions. A single-element detector responds to any temperature change in the environment and therefore needs to be thermally compensated to remove sensitivity to ambient temperature. In this study, we used dual-element PIR sensors. Dual-element detectors have the inherent advantage that the output voltage is the difference between the voltages obtained from the two elements of the detector, which subtracts out environmental effects [5]. Therefore, the performance of this human identification system is robust to the environmental temperature, and the system is suitable for both indoor and outdoor working environments.
Hidden Markov models (HMMs) are a widely used tool for sequential data modeling. Although the basic theory and inference tools were developed in the late 1960s [11,12], HMMs have been extensively applied in the last decade to applications such as speech recognition [13], DNA and protein modeling [14][15][16], handwritten character recognition [17,18], gesture recognition [19], and behavior analysis and synthesis [20]. In this study, we use HMMs to model the digital features generated by a sensor module. An example sensor module with modulated visibilities is illustrated in Fig. 1. The sensor array is distributed vertically, in the expectation that each sensor can capture the thermal dynamics of a different part of a walker. The geometry is effectively 1-D, and each sensor is sensitive only to entry into and exit from its detection region. The sensor module samples the IR fields produced by humans and converts the pyroelectric response signals into digital sequential data. For each registered subject, an HMM is built during the training phase. In the testing phase, the set of trained HMMs is then used to estimate the identity likelihoods of a newly generated signal, for either path-dependent or path-independent human identification.
Figure 2 outlines the identification process. It has two phases: training and testing. In the training phase, we construct an HMM for each registered subject. In the testing phase, the association likelihoods of an unknown sequence with the set of trained HMMs are estimated, and the identity of the subject is obtained by choosing the model with the maximum likelihood value.

Hidden Markov Models (HMMs)
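Hidden Markov models (HMMs) can be characterized by a set of output distributions and a finite-state Markov chain. A first-order HMM is defined by the following elements:
M: the number of observation symbols, with symbol set $V = \{v_1, v_2, \ldots, v_M\}$;
N: the number of states, with state set $S = \{S_1, S_2, \ldots, S_N\}$;
T: the length of the observation sequence;
$\Pi = \{\pi_i\}$: the initial state probability distribution, where $\pi_i = P(q_1 = S_i)$, $1 \le i \le N$;
$A = \{a_{ij}\}$: the state transition probability distribution, where $a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i)$, $1 \le i, j \le N$;
$B = \{b_j(k)\}$: the observation symbol probability distribution, where $b_j(k) = P(o_t = v_k \mid q_t = S_j)$, $1 \le j \le N$, $1 \le k \le M$.
Here $q_t$ denotes the state at time t. For convenience, we compactly denote the model parameter set by $\lambda = (A, B, \Pi)$; an HMM is completely specified by $\lambda$.
For the observation evaluation, let $O = (o_1, o_2, \ldots, o_T)$ be an observation sequence, where $o_t \in V$ is the observation symbol at time t. Given a model $\lambda$ and an observation sequence O, the observation evaluation problem $P(O \mid \lambda)$ can be solved using the forward-backward procedure in terms of the forward and backward variables, which are defined as follows:
$$\alpha_t(i) = P(o_1 o_2 \cdots o_t,\ q_t = S_i \mid \lambda), \qquad \beta_t(i) = P(o_{t+1} o_{t+2} \cdots o_T \mid q_t = S_i,\ \lambda).$$
Forward procedure: $\alpha_t(i)$ can be solved inductively:
1. Initialization: $\alpha_1(i) = \pi_i\, b_i(o_1)$, $1 \le i \le N$.
2. Induction: $\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] b_j(o_{t+1})$, $1 \le t \le T-1$, $1 \le j \le N$.
Backward procedure: $\beta_t(i)$ can be solved inductively:
1. Initialization: $\beta_T(i) = 1$, $1 \le i \le N$.
2. Induction: $\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)$, $t = T-1, T-2, \ldots, 1$, $1 \le i \le N$.
Finally, the observation evaluation can be written as
$$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_t(i)\, \beta_t(i) = \sum_{i=1}^{N} \alpha_T(i).$$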
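The forward procedure can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the prototype: it assumes a discrete HMM stored as plain nested lists with 0-based indices, and it rescales the forward variable at every step (a standard numerical safeguard that is an addition here) so that sequences of several hundred symbols do not underflow. The function name and argument layout are hypothetical.

```python
import math


def forward_log_likelihood(obs, pi, A, B):
    """Return log P(O | lambda) for a discrete HMM via the scaled forward procedure.

    obs: list of observation symbol indices in 0..M-1
    pi:  list of N initial state probabilities
    A:   N x N transition matrix, A[i][j] = a_ij
    B:   N x M emission matrix, B[j][k] = b_j(k)
    """
    n_states = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(o_{t+1})
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs[t]]
                for j in range(n_states)
            ]
        # Rescale so the forward variable sums to one; the logs of the
        # scale factors add up to log P(O | lambda).
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


# Toy two-state, two-symbol model (values chosen only for illustration).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
print(forward_log_likelihood([0, 1], pi, A, B))
```

Working in the log domain also makes likelihoods of different models directly comparable when they are later used for identification.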

Model training
In the training phase, the task is to find the model parameters that best fit a set of training data. In our study, we use the expectation-maximization (EM) algorithm to find the maximum-likelihood (ML) estimate of the parameters of an HMM, given a set of observed feature sequences. This process is also known as the Baum-Welch algorithm. It can be described as follows: (i) start from an initial guess of the model $\lambda = (A, B, \Pi)$; (ii) use the re-estimation formulas and the observation sequence O to derive a new model $\bar{\lambda} = (\bar{A}, \bar{B}, \bar{\Pi})$; (iii) replace $\lambda$ by $\bar{\lambda}$ and repeat the re-estimation.
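In order to describe the procedure for the estimation of HMM parameters, we first define $\zeta_t(i,j)$, the probability of being in state $S_i$ at time t and in state $S_j$ at time t+1, given the observation sequence O and the model $\lambda$:
$$\zeta_t(i,j) = P(q_t = S_i,\ q_{t+1} = S_j \mid O, \lambda) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}.$$
We also define $\gamma_t(i)$, the probability of being in state $S_i$ at time t, given O and $\lambda$:
$$\gamma_t(i) = \sum_{j=1}^{N} \zeta_t(i,j) = \frac{\alpha_t(i)\, \beta_t(i)}{P(O \mid \lambda)}.$$
In terms of these quantities, the standard Baum-Welch re-estimation formulas are
$$\bar{\pi}_i = \gamma_1(i), \qquad \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \zeta_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad \bar{b}_j(k) = \frac{\sum_{t:\, o_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}.$$
The re-estimation process iterates until the increase in $P(O \mid \lambda)$ is small enough. The Baum-Welch algorithm is guaranteed not to decrease $P(O \mid \lambda)$ with the re-estimated A, B, and $\Pi$, so the iteration proceeds until a (local) optimum is reached [21].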
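One Baum-Welch re-estimation step can be sketched as follows. This is a hedged illustration under simplifying assumptions, not the training code of the prototype: it handles a single observation sequence, uses unscaled forward and backward variables (adequate only for short sequences; the sequence lengths used in this paper would require scaling or log-domain arithmetic), and assumes all model parameters are strictly positive. The function names are hypothetical.

```python
def forward_backward(obs, pi, A, B):
    """Unscaled forward/backward variables and P(O | lambda) for a discrete HMM."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]  # beta_T(i) = 1
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
    return alpha, beta, sum(alpha[T - 1])


def baum_welch_step(obs, pi, A, B):
    """Return re-estimated (pi, A, B) and P(O | lambda) under the input model."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, p_obs = forward_backward(obs, pi, A, B)
    # gamma_t(i): probability of being in state i at time t
    gamma = [[alpha[t][i] * beta[t][i] / p_obs for i in range(n)] for t in range(T)]
    # zeta_t(i, j): probability of state i at time t and state j at time t+1
    zeta = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / p_obs
              for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_pi = list(gamma[0])
    new_A = [[sum(zeta[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(m)] for j in range(n)]
    return new_pi, new_A, new_B, p_obs


# Toy run: the likelihood reported at each iteration should not decrease.
obs = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]
pi, A, B = [0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.5, 0.5], [0.1, 0.9]]
for _ in range(5):
    pi, A, B, p_obs = baum_welch_step(obs, pi, A, B)
    print(p_obs)
```

In practice the loop would stop once the gain in likelihood falls below a chosen tolerance, and several training sequences would be pooled in the numerator and denominator sums.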

Multiple Hypothesis Testing
After the model training process, we obtain K HMMs if there are K registered subjects. Therefore, for an unknown observation sequence X, we have K hypotheses $\{\lambda_1, \lambda_2, \ldots, \lambda_K\}$ to test. Our HMM-based identification approach adopts the ML criterion, where an unknown sequence X is assigned to the model with the highest testing likelihood. The decision rule is
$$\hat{i} = \arg\max_{1 \le i \le K} P(X \mid \lambda_i),$$
where $\lambda_i$ is the HMM corresponding to the ith registered subject.
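The ML decision rule amounts to an arg-max over the per-model likelihoods. The sketch below assumes that the log-likelihood of the unknown sequence under each trained model has already been computed (for example, with a forward procedure); the subject labels and score values are made up for illustration.

```python
def identify(log_likelihoods):
    """Return the label of the model with the highest log-likelihood.

    log_likelihoods: dict mapping a subject label to log P(X | lambda_i).
    """
    if not log_likelihoods:
        raise ValueError("at least one trained model is required")
    return max(log_likelihoods, key=log_likelihoods.get)


# Hypothetical scores of one test sequence against three trained HMMs.
scores = {"subject_1": -412.7, "subject_2": -398.2, "subject_3": -455.0}
print(identify(scores))  # prints "subject_2"
```

Because the logarithm is monotonic, comparing log-likelihoods yields the same decision as comparing the likelihoods themselves. Note that this rule always returns one of the registered subjects, which is why it addresses only closed-set identification.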

Feature Generation
The most important aspect of a human identification system is to choose an appropriate feature that can distinguish individuals. In our study, we select the fixed-length binary event index sequence generated by a pyroelectric sensor array as the digital human motion feature.
Here, an event is defined as the thermal flux collected by a pyroelectric detector exceeding a threshold; it can be associated with specific motions of human subjects, such as moving across one or several adjacent detection regions. The event signals are generated by pyroelectric infrared detectors with periodic sampling masks on Fresnel lens arrays.
Figure 3 shows the experimental setup. A sensor module, which contains 8 PIR detectors, is mounted on a pillar at a height of 80 cm to sample the IR radiation from a subject. The sensory data were collected when different persons walked in the field of view (FOV) of the sensors. The detector signal is converted to an event signal by signal processing techniques: matched filtering, threshold testing, and low-pass filtering. A threshold value has to be chosen for each detector, proportional to the noise level of that signal channel. If the absolute value of a processed signal is larger than the threshold value, the signal value is set to '1', otherwise to '0'. The process of event signal generation is shown in Fig. 4. The system was implemented using a TI microcontroller (MSP430149) and RF transceiver (TRF6901) module. The sensory data are processed on the embedded microcontroller, and the event index sequences are transmitted to the host computer via a wireless channel. Fig. 5 illustrates two 4-bit digital features (event index sequences) generated by two persons. Fig. 6 shows the corresponding decimal sequential signals of Fig. 5. It can be seen that the digital features generated by the two persons are distinctive.
Figure 7 summarizes the procedure of digital feature extraction for the real-time human identification system. The length of a feature sequence for real-time identification is fixed. When the sequence reaches the preset length, the system resets itself and awaits the next batch of event sequences. These digital sequential data can be modeled by HMMs. Each HMM characterizes the statistics of the finite-state training sequences. The model parameters are initialized by a random guess and updated by the EM algorithm described in the previous section. There are two important parameters for HMM training: the number of states and the length of the training sequences. A model with more states can describe more characteristics of the digital feature of an individual. However, when the number of states is increased, the computational cost becomes much higher and over-fitting may occur [22]. We choose the number of states of the HMM for each individual after testing the identification capability with respect to different state numbers. Increasing the length of the training sequences can improve the identification rate at the expense of more training time. The selection of the length of the testing sequences requires a compromise between the identification rate and the identification time. In the path-dependent case, we set the length to 2000 for the training sequences and 200 for the testing sequences. In the path-independent case, we set the length to 3000 for the training sequences and 500 for the testing sequences.
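The threshold test and the conversion of detector bits into an event index can be sketched as follows. This is only an illustration under stated assumptions: the input values are taken to be already filtered, the per-detector thresholds are given, and the bit ordering (first detector as the most significant bit) is an assumed convention, since the text does not specify one. All function names are hypothetical.

```python
def to_event_bits(samples, thresholds):
    """Threshold test: 1 if |processed signal| exceeds the detector's threshold, else 0.

    samples:    one processed value per detector at a single time instant
    thresholds: one threshold per detector (proportional to its noise level)
    """
    return [1 if abs(s) > th else 0 for s, th in zip(samples, thresholds)]


def to_event_index(bits):
    """Pack detector bits into one integer (first detector = most significant bit)."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index


def event_index_sequence(frames, thresholds):
    """Convert a list of per-instant detector readings into an event index sequence."""
    return [to_event_index(to_event_bits(frame, thresholds)) for frame in frames]


# Four detectors give indices in 0..15; eight detectors would give 0..255.
frames = [
    [0.9, -0.1, 0.05, -1.2],
    [0.0, 0.0, 0.0, 0.0],
    [0.3, 0.6, -0.7, 0.1],
]
print(event_index_sequence(frames, thresholds=[0.5, 0.5, 0.5, 0.5]))  # [9, 0, 6]
```

Each event index then serves as one observation symbol of the HMM, so the number of observation symbols M is 2 raised to the number of detectors.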
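The fixed-length buffering of the feature sequence can be sketched as a small generator. This is an assumed reading of the reset behaviour described above (collect event indices until the preset length is reached, emit the sequence, then start over); incomplete trailing data are simply discarded in this sketch, and the function name is hypothetical.

```python
def fixed_length_sequences(event_stream, length):
    """Yield consecutive feature sequences of exactly `length` event indices."""
    if length <= 0:
        raise ValueError("length must be positive")
    buffer = []
    for event_index in event_stream:
        buffer.append(event_index)
        if len(buffer) == length:
            yield buffer
            buffer = []  # reset and wait for the next batch


# With a preset length of 3, seven samples give two complete sequences.
print(list(fixed_length_sequences([5, 0, 9, 9, 6, 0, 3], 3)))  # [[5, 0, 9], [9, 6, 0]]
```

For the path-dependent experiments the preset testing length would be 200, and for the path-independent experiments 500.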

Path-dependent Recognition
For the path-dependent recognition problem, the sensory data were collected while different persons walked back and forth along a prescribed straight path, 2.5 m away from and perpendicular to the sensor. The experimental setup is shown in Fig. 3. A sensor module, which contains 4 or 8 PIR detectors and Fresnel lens arrays, is mounted on a pillar at a height of 80 cm to sample the IR radiation from the human target. The vertical field of view of the sensor module (8 PIR detectors) covers 53~136 cm from the ground. Within this range, the sensor module can detect IR radiation from the torsos, arms, and legs of normal-height humans at the same time. A more detailed discussion of the sensor module location can be found in our previous paper [9].
If the walker belongs to a predefined set of known walkers, the task is referred to as closed-set identification. Adding a "none-of-the-above" option to closed-set identification gives open-set identification [23]. Fig. 8 (a) shows the experimental results of the verification (open-set identification). The digital features of five walkers were tested against one person's HMM. It turns out that the person's own features do not always achieve the maximum log-likelihood. However, when we check the digital features of that person against all five persons' HMMs (closed-set identification), the maximum likelihood is always achieved for that person's HMM, that is, the identification is correct, as shown in Fig. 8 (b). This suggests that the proposed HMM approach is only suitable for the closed-set identification case.
The poor performance of the HMM approach in the verification case might be caused by the intrinsic statistical instability of the digital feature. The short testing sequences contain less statistical information and more uncertainty, whereas the HMMs, derived from much longer training sequences, contain enough statistical information to make reliable statistical inferences. Longer testing sequences might improve the system performance in verification and open-set identification. However, from a practical point of view it is not realistic to collect a long digital sequence to recognize a person. Therefore, in this paper we only investigate the use of the digital feature for closed-set identification.

Lens Array Visibility
We assume that a mask with a higher spatial sampling frequency can capture more detailed IR information generated by human motions. Fig. 9 shows two sensor modules with different sampling masks. Including the sensor module shown in Fig. 1, we have three types of sensor modules containing 4 detectors for path-dependent identification. These sensor modules with different spatial sampling masks are used to create detection regions of different sizes within the sensor FOVs. In the training stage, we constructed 10 HMMs for 10 persons. In the testing stage, each person was tested 20 times. The number of walkers along a fixed path to be identified was increased from 2 to 10 for each type of sensor module, tested against the 10 feature models obtained from the training stage. The average path-dependent identification rates of the three different sensor modules with respect to the group size are shown in Fig. 10. We can see that the sensor module with the highest spatial sampling frequency has the best identification performance and that the average identification rates decrease when the group size grows from 2 to 10.

Lens Array Visibility
To improve the identification capability, we increased the number of PIR detectors in the sensor modules to sample more information in the IR field. Fig. 11 shows three different sensor modules with 8 detectors using different periodic sampling masks. With more detectors, we can obtain binary digital features of a higher dimension. The dynamic range of the observation for the HMM becomes 0~255 (8 bits). Fig. 12 shows the average identification rates of the three different sensor modules with respect to the number of persons. We can see that model 8H has the best performance. It can achieve an average identification rate above 90% for a small group of 10 persons.

Path-independent recognition
For the path-independent case, we used the same sensor setup as in the path-dependent case. Each person in a group of 10 walked randomly inside a 9 m × 9 m room. We used the mask model 8H to capture the digital features for the path-independent identification case. Because of the randomness in paths, longer training sequences and testing sequences are needed. Fig. 13 illustrates the impact of the length of the training sequences and testing sequences on the identification rate. We can see little improvement in average identification rates for training data lengths beyond 3000. For HMMs derived from training data of length 3000, testing sequences longer than 500 do not increase the identification rate. Therefore, for path-independent recognition we chose a length of 3000 for the training sequences and 500 for the testing sequences. Table 1 shows the closed-set path-independent identification results for 10 walkers. It can be seen that in identification among 10 walkers the lowest identification rate is 60%, the highest is 95%, and the average is 78.5%. Fig. 14 shows the average identification rates we obtained. As in the path-dependent case, the identification rate drops as the group size increases. When the group size grows from 2 to 10, the average identification rate decreases from 92.5% to 78.5%.
This human recognition system is based on the IR radiation from human bodies. Among all the factors that affect human heat radiation, the clothing that walkers wear is the most important one. In our initial experiments, the recognition capability of the system was invariant to clothes of similar fabric. However, wearing clothes of a different fabric (e.g., cotton for training and polyester for testing) degrades the recognition rate.

Conclusion
In our previous paper [9], the spectrum of a single PIR sensor's temporal signal (analog feature) was used to represent the human motion features. That system is only suitable for path-dependent human identification. In this paper, we proposed a digital-feature-based system for closed-set human identification. PIR detector arrays are used to generate digital sequential data that represent human motion features. The advantages of the digital feature are its less rigid training process, decreased sensitivity to walking speed, effectiveness in the path-independent identification mode, and high data compression ratio for wireless data transmission.
An HMM is constructed for each person by an EM learning process and used as a statistical feature model. A person is identified by testing an unknown digital feature against all the HMMs and selecting a model based on the maximum-likelihood criterion. Different numbers of detectors and different sampling masks in the sensor module were also studied to improve identification rates. The identification performance can be improved by increasing the number of detectors and the spatial sampling frequency of the masks. Among all the tested sensor modules, the one containing 8 detector units and a high spatial sampling mask demonstrated the best performance. Its average identification rates for 10 persons are 91% and 78.5% in the path-dependent and path-independent cases, respectively.
Our future work will include better selection of features and algorithms for open-set and less clothing-sensitive human identification, and simultaneous recognition of multiple people by using multiple sensor modules.


Fig. 11. Three sensor modules with 8 sensor units and their visibility matrices that define the detection regions of the eight sensors. (a) Model 8L; (b) Model 8M; (c) Model 8H.

Fig. 12. Average path-dependent identification rates as a function of the number of persons for the three types of sensor modules.

Fig. 13. Average identification rates for a group of 10 as a function of (a) the length of the training sequences; (b) the length of the testing sequences.

Fig. 14. Average path-independent identification rates as a function of the number of persons.

Table 1. Closed-set path-independent identification results for 10 walkers. Column headings: Results, Eve, Jason, Pai, Bob, Scott, John, Evan, Arnak, Mohan, Yu.