A Proposed Data Stream Clustering Method for Detecting Anomaly Events in Crowd Scene Surveillance Videos

In this research, a new data stream clustering method based on the seed based region growing technique is implemented to perform abnormal event detection in an anomaly detection system. HARRIS or FAST detectors are applied to the frames of video clips from two publicly available datasets, the UCSD pedestrian dataset (Ped1 and Ped2) and the VIRAT video dataset, to extract a list of pairs of interest points. From these pairs a list of features (distance, direction, x-coordinate, y-coordinate) is obtained and used as input to the new clustering method. With the HARRIS detector, the method achieves detection rates of about (9.09%, 52.17%, 61.67%) and false alarm rates of (18.79%, 36.09%, 66.67%) on the Ped1, Ped2, and VIRAT datasets respectively. With the FAST detector, the best achieved detection rates are (7.88%, 46.09%, 58.33%) and the false alarm rates are (21.21%, 40.87%, 63.33%) on the same three benchmarks respectively.


Introduction
The increase in the number of cameras in urban areas, together with the advent of very high capacity digital video recorders (DVRs), the development of Internet protocol (IP) networks (IP cameras), and the technological advances in the protection of critical infrastructures, motivates solving the problem of the inability to exploit video streams in real time for the purposes of detection or anticipation. This involves having the videos analyzed by algorithms that detect and track objects of interest (usually people or vehicles) over time, and that indicate the presence of

Detecting Dominant and Rare Events Simultaneously
The dominant behaviors refer to the normal events observed in a scene. These are events that have a higher probability of occurrence than others in the video and generally do not attract much attention. These dominant behaviors are categorized into two classes. The first deals with foreground activities and the second describes the scene background. The second is referred to as background subtraction, but it is more general and more complicated than background subtraction because it includes the scene background without being limited to it, and it learns both normal and abnormal patterns [5].
The information provided by surveillance video feeds is represented in various human behavior representations, such as commonly occupied regions, human trajectory shapes, and bag-of-words techniques (codebook representation). The first and second are used when the camera is far from the subject, whereas the last is suitable when human limb movements are observable and is also referred to as the interest points based method [9,10,11].
Human behavior features are divided into two major parts: human global motion features and human local motion features. Human global motion features utilize the information on a subject's location at various times. They are useful when human limb movements are not observable. Systems using these features consider only the tracking information of each subject's location over time. A lot of information can be extracted from human trajectories, such as the commonly occupied region and the human walking path shape [12].
Human local motion features are useful when human limb movements can be discerned in video feeds. Based on the degree of non-rigidity of the objects, human motion is classified as rigid or non-rigid. There is also articulated motion, in which limb motions are rigid but the overall motion is not [13].
Human motion approaches are categorized according to whether or not they use prior knowledge about the object shape. The categories are as follows:
• Model based approaches: use prior knowledge about the shape of an object [14].
• Appearance based approaches: build a body representation in a bottom-up fashion by first detecting appropriate features in a sequence of frames. Silhouettes and interest points are some examples of these features. These approaches do not require a specific object model but are sensitive to noise [15,16]. Interest points can be used to search for an action contained in a short query video over a large resolution without using background subtraction or object tracking [16,17].

Interest Points Features
Interest points (salient points) are local spatio-temporal features in a video segment, detected by an approach that finds and matches these points in both the temporal and spatial domains. An interest point is detected where a region produces high response values. Interest point features are formed by the interest points and their surrounding neighborhoods, so they are able to describe the action captured in a video segment [18]. In this paper interest point features are obtained using the HARRIS or FAST detector; for more information about these detectors see [19,20].

Data Stream Clustering and Seed Based Region Growing Technique
A data stream model is defined as an ordered sequence of points $x_1, \ldots, x_n$, where $n$ is very large ($n \approx \infty$). The sequence has to be read in order, and only once or a small number of times [21]. The data stream model requires decisions to be made before all the data become available, which makes it similar to online models, so it needs dedicated algorithms (data stream algorithms). These algorithms are allowed to take action after a group of points arrives [21].
Seed based region growing is a segmentation technique. The basic idea of this method is to group a collection of pixels into a region based on similar properties; these properties may include intensity, texture, or color, as shown in figure (1) [22,23].
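The grouping idea can be sketched as follows. This is a minimal illustration and not the segmentation procedure of [22,23]: it assumes that intensity is the only similarity property, that similarity is measured against the seed pixel, and that pixels are 4-connected. The function name `grow_region` and the parameter `tolerance` are hypothetical.

```python
from collections import deque

def grow_region(image, seed, tolerance):
    """Return the set of (row, col) pixels connected to `seed` whose
    intensity differs from the seed intensity by at most `tolerance`."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        # 4-connected neighbours (an assumption; 8-connectivity is also common)
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region

# Toy 3x4 intensity image (illustrative values only)
image = [
    [10, 12, 90, 91],
    [11, 13, 92, 90],
    [10, 95, 94, 12],
]
region = grow_region(image, seed=(0, 0), tolerance=5)
```

In this toy image the pixel at (2, 3) has a similar intensity to the seed but is not connected to it through similar pixels, so it stays outside the grown region.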

Data stream and online clustering approaches are similar in that both require decisions to be made before all data are available. The models are not identical, however: an online algorithm can access the first i data points (with its previous i decisions) when reacting to the (i+1)th point, whereas the amount of memory available to a data stream algorithm is bounded by a function of the input size (a sublinear function is used). Also, a data stream approach is not required to take action after the arrival of each point; it may act after a group of points has arrived [18]. Data stream clustering approaches store and process large scale data efficiently because they provide summarizations of the past data; see [24,25,26,27,28,29] for more information. In this work a seed filling clustering algorithm is utilized to form a new data stream algorithm.
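The two properties described here, acting after a group of points and keeping only a summary of past data, can be sketched generically. The example below is an assumption-laden illustration and not the proposed algorithm: the helper `batches`, the class `RunningSummary`, and the choice of a running mean as the summary are all hypothetical.

```python
from itertools import islice

def batches(stream, size):
    """Yield successive groups of at most `size` points, reading the stream once."""
    it = iter(stream)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

class RunningSummary:
    """Constant-memory summary of all points seen so far (count and sum only)."""
    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, chunk):
        self.count += len(chunk)
        self.total += sum(chunk)

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0

summary = RunningSummary()
decisions = []
for chunk in batches(range(1, 11), size=4):
    summary.update(chunk)           # past points are discarded after this call
    decisions.append(summary.mean)  # one decision per group, not per point
```

The stream is read in order and only once, and the memory used does not grow with the number of points already processed.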

Design of Data Stream Clustering Algorithm Based on Seed Filling Technique
Anomaly behavior detection systems built on the data stream clustering algorithm based on the seed filling technique were created, trained, and applied to the same datasets. Their performance was measured using Detection Rate (DR), False Alarm Rate (FAR), Recall (R), Precision (P), and the Coverage Test (CT); for more detail about these measurements see [30,31].
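For orientation, the sketch below computes four of these measures from confusion-matrix counts. It assumes the commonly used definitions (DR and recall as TP/(TP+FN), FAR as FP/(FP+TN), precision as TP/(TP+FP)); the exact definitions used in this work are those of [30,31] and may differ. The Coverage Test is not covered, and the counts in the example are hypothetical.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute DR, FAR, precision, and recall from confusion-matrix counts.
    The formulas are the common textbook ones (an assumption)."""
    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "detection_rate": ratio(tp, tp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }

# Hypothetical counts, for illustration only
m = detection_metrics(tp=12, fp=8, tn=32, fn=8)
```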

The Proposed Systems Layouts
The diagram for the proposed anomaly detection system is shown in figure (2).

Preprocessing
Conversion to gray is applied as preprocessing to both the training and test datasets, to make the data of both sets more suitable and easier to analyze. Preprocessing is a necessary stage when the requirements are typically obvious and simple. In this step the frames of the video dataset are converted to gray images: each pixel in the extracted color frame has three color components (red, green, blue), and the value of each component is represented by one byte. The gray value of each pixel is computed using the average method, and computing this value for all pixels converts the color frame to a gray frame.
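The average method can be written in a few lines. This is a minimal sketch assuming a frame is stored as rows of (red, green, blue) byte triples and that the average is truncated to an integer; both the data layout and the rounding choice are assumptions, and `to_gray` is a hypothetical name.

```python
def to_gray(frame):
    """Convert a color frame (rows of (r, g, b) tuples, 0..255 each)
    to a gray frame using the average of the three components."""
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in frame]

# Toy 2x2 color frame (illustrative values only)
frame = [
    [(255, 0, 0), (30, 60, 90)],
    [(0, 0, 0), (255, 255, 255)],
]
gray = to_gray(frame)
```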

Feature Generation
From the training and test video clips of the dataset, interest point information is extracted: the interest point pairs and the features (distance, direction, x-coordinate, y-coordinate) of these pairs. This is done in the feature detection and feature matching phases, in which the HARRIS or FAST interest point detector extracts interest points from the current frame and the previous frame (according to a predefined threshold PFT), and the interest points from both frames are then matched to obtain a list of pairs of interest points. The process is illustrated in algorithms (1, 2).
After that, an extraction process estimates the list of features (distance, direction, x-coordinate, y-coordinate) from these pairs, see figure (3). This is illustrated in algorithm (3).
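The four features of one matched pair can be sketched as below. This is not algorithm (3) itself: it assumes that distance is the Euclidean displacement between the matched points, that direction is the displacement angle in degrees in the range [0, 360), and that the reported coordinates are those of the point in the current frame. The function name `pair_features` is hypothetical.

```python
import math

def pair_features(prev_pt, curr_pt):
    """Return (distance, direction, x, y) for one matched interest point pair.
    prev_pt and curr_pt are (x, y) positions in the previous and current frame."""
    (x0, y0), (x1, y1) = prev_pt, curr_pt
    dx, dy = x1 - x0, y1 - y0
    distance = math.hypot(dx, dy)                           # Euclidean displacement
    direction = math.degrees(math.atan2(dy, dx)) % 360.0    # angle in [0, 360)
    return (distance, direction, x1, y1)                    # current-frame coordinates (assumption)

f = pair_features((10, 20), (13, 24))
```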

Proposed Data Stream Clustering
In this paper a data stream model is defined as the four features of each pair of interest points obtained from the feature detection and matching phase. The proposed algorithm makes a decision after a group of these features has arrived, and these decisions (past data of clusters) are then summarized and organized in a database that gives a summarization of all clusters in a limited memory size. This system utilizes seed based region growing technique as

Second Proposed Data Stream Clustering System Algorithm Steps
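Step1: Read Video.
Step2: Convert color frame to gray frame.
Step3: Extract list of interest point pairs by algorithm (1 or 2) using HARRIS or FAST detector.
Step4: Extract list of features (distance, direction, x-coordinate, y-coordinate) by algorithm (3).
Step5: Cluster list of features by algorithm (5) using Seed filling.
Step6: If training phase then
    Create Database_Clusters using algorithm (8)
Else
    Detect and localize anomaly behavior using algorithm (9)
End If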
The process starts by reading video clips from the datasets (training videos in the training phase, testing videos in the testing phase) to obtain the list of video frames and the number of frames, and continues by converting these frames to gray frames. The list of interest point pairs is then extracted by applying an interest point detector (HARRIS or FAST), as shown in algorithms (1 or 2). After that, the list of interest point features (distance, direction, x-coordinate, y-coordinate) is extracted using algorithm (3). The next step clusters the extracted list of features by employing the seed based region growing technique in the data stream clustering algorithm; this is done based on coordinates, as seen in algorithm (5). Algorithm (5) tests all pairs in the list of pairs: if a pair is not classified under any cluster, it is classified as a new cluster and added to the list of classified neighbor pairs, to be tested by computing its closest neighbors in the list of pairs using the Euclidean distance, as shown in algorithm (6), where a pair is considered a neighbor when the distance is less than the coordinate threshold.
Then all neighbor pairs are tested: if a neighbor pair is not classified under any cluster, it is assigned to the cluster of the tested pair and added to the classified neighbor pairs list so that its own neighbor pairs are examined in turn. After the test of the neighbor pairs list is complete, if there is more than one pair in the set, new centroids are computed for these neighbor pairs and added to the list of centroids; otherwise the set is discarded and removed from the neighbor pairs list. The new centroids are computed by determining minx, miny, maxx, maxy for each group of neighbor pairs in the same cluster to establish the coordinates of the new cluster; the averages of distance and direction are also computed, as shown in algorithm (7). In the enrolment phase the cluster database is created during the clustering process, see algorithm (8), while in the anomaly detection and localization phase the anomaly event and its location are detected with the help of the previously created database.
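The clustering walk-through above can be condensed into a short sketch. It is an illustration under stated assumptions and not algorithms (5) to (7) themselves: each feature is a tuple (distance, direction, x, y), two pairs are neighbors when the Euclidean distance between their (x, y) coordinates is below the coordinate threshold, and the direction average is a plain arithmetic mean that ignores angle wrap-around. The names `seed_fill_clusters` and `coord_threshold` are hypothetical.

```python
import math
from collections import deque

def seed_fill_clusters(features, coord_threshold):
    """Cluster features (distance, direction, x, y) by seed filling on (x, y).
    Returns one summary (bounding box plus averages) per cluster that has
    more than one member; single-member clusters are discarded."""
    labels = [None] * len(features)
    centroids = []
    next_label = 0
    for seed in range(len(features)):
        if labels[seed] is not None:
            continue                      # already classified under a cluster
        labels[seed] = next_label         # unclassified pair starts a new cluster
        members = [seed]
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            xi, yi = features[i][2], features[i][3]
            for j, feat in enumerate(features):
                close = math.hypot(xi - feat[2], yi - feat[3]) < coord_threshold
                if labels[j] is None and close:
                    labels[j] = next_label
                    members.append(j)
                    queue.append(j)       # its own neighbors are examined later
        next_label += 1
        if len(members) > 1:
            xs = [features[k][2] for k in members]
            ys = [features[k][3] for k in members]
            centroids.append({
                "min_x": min(xs), "min_y": min(ys),
                "max_x": max(xs), "max_y": max(ys),
                "avg_distance": sum(features[k][0] for k in members) / len(members),
                # arithmetic mean; does not handle wrap-around at 360 degrees
                "avg_direction": sum(features[k][1] for k in members) / len(members),
            })
    return centroids

# Toy feature list (illustrative values only)
features = [
    (2.0, 10.0, 0, 0),
    (4.0, 20.0, 3, 0),
    (6.0, 30.0, 6, 0),
    (1.0, 90.0, 50, 50),
]
centroids = seed_fill_clusters(features, coord_threshold=5)
```

In the toy data the first three pairs are chained into one cluster even though the first and third are farther apart than the threshold, which is the region growing behavior; the isolated fourth pair forms a single-member set and is discarded.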

Anomaly Behavior Detection and Localization
After the cluster database has been constructed by passing the training video dataset through the system, it is used to detect anomaly behavior in the test video dataset and to localize its position; algorithm (9) describes the anomaly behavior detection and localization. In this algorithm, the class is obtained from the proposed data stream clustering algorithm in the test phase, and the database of clusters obtained from the training phase is then searched to find whether this class is in the database; on this basis it is detected as an anomaly event and its location in the frame is marked using the coordinates from the data stream clustering process.
The experiment was repeated 15 times, and the values of the data stream clustering parameters used are shown in table (2). Table (3) illustrates the anomaly detection test results for the UCSD (Ped1) video clip sets, using features extracted from the interest point pairs as input to the anomaly behavior detection system, for both methods of feature extraction (HARRIS or FAST detectors); the details of the detection system using the UCSD (Ped2) dataset and the VIRAT dataset are presented in tables (4, 5) respectively.
Table (2): The proposed data stream clustering parameters.
The test results show that the proposed anomaly detection system with the HARRIS detector is capable of recognizing the anomaly event with average detection rates of (9.09%, 52.17%, 61.67%) for the test video clip sets from the three datasets respectively; the average false alarm rate was

FAST DETECTOR
Appearance based approaches are classified into three types: flow based approaches, spatial-temporal shape template based approaches, and interest point based approaches. The interest points are extracted by interest point detectors. This is an extension of the key points concept for object detection in images. One of the advantages of using interest points is that they can be

Figure (1): Example of region growing. (a) Start of growing a region. (b) Growing process after a few iterations.

Vol: 13 No: 4, October 2017
DOI: http://dx.doi.org/10.24237/djps.1304.323B
P-ISSN: 2222-8373 E-ISSN: 2518-9255

Video Dataset
The selected video datasets are two publicly available datasets. The first is the UCSD dataset, comprising the pedestrian datasets (Ped1) and (Ped2); the second is the VIRAT dataset. The entire set of selected videos is divided into two sets: (i) a training video set used for the development of the system, and (ii) a test video set containing videos used to measure the anomaly behavior detection performance of the system.

Figure (4) illustrates the process of the second proposed data stream clustering algorithm based on coordinates.

Figure (6): The bar chart result of the proposed anomaly detection system using the HARRIS detector.