Pedestrian Detection under Parallel Feature Fusion Based on Choquet Integral

Feature-based methods are currently the mainstream approach to pedestrian detection. In such methods, extracting appropriate features is the key to the overall performance of the detection system. The appearance of a pedestrian is believed to be captured better by combining edge/local shape features with texture features. The current practice in this field is simply to concatenate the HOG (histogram of oriented gradient) and LBP (local binary pattern) features extracted from an image into a new feature of large dimension, which achieves better performance at the cost of a larger number of features. In this paper, the Choquet integral based on the signed fuzzy measure is introduced to fuse HOG and LBP descriptors in parallel, which is expected to improve accuracy without increasing the feature dimension. The parameters needed in the fusion process are optimized by a training algorithm based on a genetic algorithm. This architecture has three advantages. First, because HOG and LBP features are fused in parallel, the dimension of the new features does not increase. Second, feature fusion is fast, which reduces the time needed for pedestrian detection. Third, the fused features retain the advantages of both HOG and LBP features, which helps improve detection accuracy. Experiments with the proposed architecture yield promising results.


Introduction
Pedestrian detection is a key technology of intelligent transportation [1][2][3]. In addition, the core technologies involved in pedestrian detection are indispensable for other applications such as robotics, video surveillance and behavior prediction [4][5][6]. In recent years, researchers have proposed many different pedestrian detection methods and successfully applied them in commercial and military fields [7][8][9][10][11]. Feature-based pedestrian detection is the mainstream approach at present. Although such methods differ in how they process raw data and train the classifier, they basically follow the same path, shown in Figure 1. The input of this path is the original image in pixel representation, while the output is a set of rectangular bounding boxes of different sizes, each corresponding to a pedestrian identified in the image. A typical pedestrian detection scheme includes three steps: selection of detection regions, feature extraction and classification of detection regions.
In the region selection stage, the input is usually the original image, and the output is a group of regions with different sizes and aspect ratios. The sliding window method is the simplest region selection algorithm; it can be used to obtain regions at multiple scales and aspect ratios. As the number of candidate regions has a great influence on the speed of the whole pedestrian detection system, more complicated approaches analyze the original image in advance to filter out regions believed to contain no target objects, thereby reducing the number of candidate regions to be tested. For feature extraction, the input is a candidate region that may or may not contain a pedestrian, and the output is a real-valued or binary-valued feature vector. The criterion for choosing features is whether they can distinguish pedestrians from non-pedestrians. Feature extraction can be divided into single-feature and multifeature extraction. Single features typically include HOG (histogram of oriented gradient) [12], LBP (local binary pattern) [13] and Haar-like [14], while representative multifeature combinations are HOG-LBP [15], HOG-Haar-like [16] and HOG-SIFT (scale-invariant feature transform) [17,18].
In the region classification stage, the main task is to identify whether there is a human shape in the candidate region. The feature vector obtained in the feature extraction stage for a candidate region is fed into the classifier, which outputs a binary label indicating whether the region is positive (containing a pedestrian) or negative (not containing a pedestrian). Classical classifiers include SVM (support vector machine) [19,20], AdaBoost [21,22] and CNN (convolutional neural network) [23,24].
Compared to multi-component combination methods, feature-based pedestrian detection methods have a relatively simple implementation process and structure. Switching to a different feature does not require changing the original architecture, which guarantees better portability. The critical steps of the feature-based pedestrian detection pipeline described above are feature extraction and region classification. Therefore, a novel and efficient feature extraction algorithm is essential, which is the motivation of this paper.
HOG [12] is widely considered one of the best features for capturing edge or local shape information. It has achieved great success in target recognition and detection [25][26][27]. In [28], Zhu et al. integrate the cascade rejection method to accelerate HOG extraction in human detection. In [29], the HOG-DOT algorithm with the L1 normalization technique and SVM is used. Although this method achieves a good TPR (true positive rate), its FPR (false positive rate) is very high. HOG using the discrete wavelet transform is proposed in [30], but its detection rate is only 85.12%. In [31], the selective gradient self-similarity (SGSS) feature is applied together with HOG for feature extraction. The addition of SGSS significantly improves the accuracy of pedestrian detection, and the detection ability of the AdaBoost-based cascade structure is better than that of linear SVM or HIKSVM.
It should be noted that HOG performs poorly when the background is cluttered with noisy edges. In such situations, LBP [32] can play a very good complementary role. The LBP feature has been widely used in different applications and has shown satisfactory performance in face recognition. It is a very effective feature for distinguishing images because of its invariance to monotonic gray-level changes and its high computational efficiency.
For these reasons, it is natural to expect that combining edge/local shape information with texture information can capture the appearance of pedestrians more effectively. In [15], a feature descriptor based on the serial fusion of HOG and LBP features is proposed to address partial occlusion. Although the detection rate is improved, detection efficiency is sacrificed owing to the increase in dimension. In [33], Jiang et al. also use concatenation to combine HOG and LBP features to form a new feature vector, which is then sent to an XGBoost (eXtreme Gradient Boosting) classifier. The problem of sacrificing detection efficiency to improve detection accuracy therefore still exists. In feature fusion for pedestrian detection, the current mainstream method is to concatenate several features in series. This approach raises two problems. First, serial concatenation greatly increases the number of features for the image being processed and consequently slows processing. Second, concatenation does not consider the possible interaction between features or the impact of this interaction on the final classification decision. Therefore, concatenation-based fusion cannot be regarded as feature fusion in a strict sense.
In this paper, Choquet integral based on the fuzzy measure is applied to realize the parallel fusion of HOG and LBP feature descriptors. This methodology is expected to improve the detection accuracy without increasing the feature dimension. The Choquet integral based on fuzzy measure is a very effective feature fusion method. When it is applied to the fusion problem, the fuzzy measure in the integral can well reflect the importance of each feature to the fusion target and the influence of the interaction among features on the fusion target. At the same time, it may help us to mine the possible interaction between different pedestrian descriptors, which has a positive research significance for the development of pedestrian detection technology.
The procedure of pedestrian detection based on the parallel fusion of HOG and LBP features is shown in Figure 2. Here, the HOG features and the histograms of LBP descriptors of each cell are extracted from the original image. They are fused in parallel by the Choquet integral, whose internal parameters, i.e., the values of the fuzzy measure, are optimized by a genetic algorithm. This fusion results in a new set of features, called parallel-HOG-HOLBP (histogram of oriented gradient-histogram of local binary patterns), which is then passed to an SVM for classification. The resulting feature not only retains the original advantages of HOG and LBP, but also avoids the growth in dimension that is inherent in traditional serial fusion. Retrieving the internal coefficients of the fuzzy measure through a global optimization algorithm is more rational than trial and error. We designed a series of experiments to verify the effectiveness of the proposed method. The experimental results show that the proposed method has better overall performance than the existing methods it is compared with.
The remainder of this paper is organized as follows. In Section 2, the features used in this paper are presented. Feature fusion based on the Choquet integral is introduced in Section 3. An adaptive algorithm based on a genetic algorithm is implemented in Section 4 to optimize the internal coefficients of the fuzzy measure in the Choquet integral. In Section 5, experimental results and analysis are presented. Finally, Section 6 summarizes the paper and gives an outlook.

Features Realignment
When the Choquet integral is used to fuse two kinds of features in parallel, the features to be fused must have the same dimension. Therefore, this section discusses the mechanism for realigning HOG and LBP features.

Histogram of Oriented Gradient Feature Extraction
HOG feature calculates the distribution of gradient in local image, so it can describe the edge or local shape information of object well. A typical HOG feature extraction process includes four steps:

1. Standardize gamma space and color space.
To reduce the influence of illumination, the whole image is normalized first. Local surface exposure contributes a large proportion of the texture intensity of the image, so this compression effectively reduces the effect of local shadows and illumination changes. As color information has little effect, the original RGB image is usually converted to a gray image, and gamma correction is applied to normalize it:

$$ I(x, y) \leftarrow I(x, y)^{\gamma} $$

where $I(x, y)$ represents the intensity of the pixel at coordinates $(x, y)$ and $\gamma$ is the gamma correction parameter. Generally, $\gamma$ is set to 0.5.

2. Calculate the gradient of each pixel.
The horizontal gradient $G_x(x, y)$ and the vertical gradient $G_y(x, y)$ are calculated for the normalized image. The gradient value $G(x, y)$ and the gradient direction $\theta(x, y)$ of each pixel are then calculated from these two directional gradients.

3. Construct the histogram of gradient direction for each cell.
The image is divided into cells of 8 × 8 pixels, as shown in Figure 3a. The 360-degree range of gradient directions is divided evenly into nine bins (Figure 3b), and a histogram over these nine bins is constructed to accumulate the gradient information of the 8 × 8 pixels. The horizontal axis of the histogram is the nine direction bins, and the height of each bin is the sum of the gradient values of the pixels whose gradient directions fall in that bin.

4. Construct the HOG feature for an image.
Each cell yields a 9-dimensional vector. As shown in Figure 3a, four adjacent cells constitute a block, and the vectors of the four cells in a block are concatenated into a 36-dimensional vector. The block scans the image with a step of one cell. Finally, the vectors of all blocks are concatenated to form the HOG feature of the image. For example, for a 128 × 64 image, every 8 × 8 pixels constitute a cell and every 2 × 2 cells constitute a block. As each cell has nine features, there are 4 × 9 = 36 features in each block. With a step size of eight pixels, there are seven scanning windows in the horizontal direction and 15 in the vertical direction. In other words, a 128 × 64 image has 36 × 7 × 15 = 3780 features.
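The four steps above can be sketched in plain Python. This is a minimal illustration rather than the implementation used in the paper: the function names, the centered-difference gradient, the zero gradient at image borders and the hard (non-interpolated) binning are all assumptions made for the example, and block normalization is omitted.

```python
import math

def gamma_normalize(gray, gamma=0.5):
    """Gamma-compress a grayscale image given as rows of values in [0, 1]."""
    return [[v ** gamma for v in row] for row in gray]

def gradients(img):
    """Per-pixel gradient magnitude and direction in degrees, [0, 360).

    Assumption: centered differences; border pixels keep a zero gradient.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 360.0
    return mag, ang

def cell_histogram(mag, ang, top, left, cell=8, bins=9):
    """Magnitude-weighted direction histogram of one cell (hard binning)."""
    hist = [0.0] * bins
    width = 360.0 / bins
    for y in range(top, top + cell):
        for x in range(left, left + cell):
            hist[int(ang[y][x] // width) % bins] += mag[y][x]
    return hist

def hog_length(height, width, cell=8, block=2, bins=9):
    """Number of HOG features when blocks slide with a step of one cell."""
    cells_y, cells_x = height // cell, width // cell
    blocks_y, blocks_x = cells_y - block + 1, cells_x - block + 1
    return blocks_y * blocks_x * block * block * bins

print(hog_length(128, 64))  # 3780, matching 36 x 7 x 15 in the text
```
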

Histogram of LBP Descriptor
The LBP descriptor captures the gray-level differences between a center pixel and its neighboring pixels within a region of a specific size. If we denote the gray value of a pixel as $I(x, y)$, then the LBP value of the center pixel $(x_c, y_c)$ is a decimal number calculated by

$$ \mathrm{LBP}(x_c, y_c) = \sum_{k=0}^{K-1} s\big(I(x_k, y_k) - I(x_c, y_c)\big) \, 2^k $$

where

$$ s(z) = \begin{cases} 1, & z \ge 0 \\ 0, & z < 0 \end{cases} $$

Here, $K$ is the number of neighbor pixels around the center pixel and $(x_k, y_k)$ is the $k$-th neighbor. Figure 4 shows an LBP feature extraction process with $K = 8$ and radius 1.
In order to fuse HOG and LBP in parallel by the Choquet integral, the two features extracted from a candidate image must have the same dimension. Following the construction method of the HOG feature, we realign the LBP feature for the candidate image. The gradient value and gradient direction of the LBP value are calculated for each pixel. The 360-degree range of gradient directions is divided evenly into nine bins, and a histogram over these nine bins is constructed to accumulate the gradient information of each cell. The horizontal axis of the histogram is the nine direction bins, and the height of each bin is the sum of the gradient values of the pixels whose gradient directions fall in that bin, as shown in Figure 5. Similarly, for a 128 × 64 image, a new feature vector of length 3780 is constructed. We call this new feature vector the histogram of local binary patterns (HOLBP).
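The LBP computation for K = 8 and radius 1 can be sketched as follows. The neighbor ordering (clockwise from the top-left pixel) and the handling of border pixels are assumptions made for the example; the text does not specify them. The resulting LBP map would then go through the same gradient and cell-histogram steps as a gray image to produce HOLBP.

```python
# Neighbor offsets (dy, dx) for K = 8 and radius 1, clockwise from top-left.
# Assumption: the starting neighbor and direction are arbitrary but fixed.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_value(img, y, x):
    """LBP code of the pixel at row y, column x (must not lie on the border)."""
    center = img[y][x]
    code = 0
    for k, (dy, dx) in enumerate(OFFSETS):
        if img[y + dy][x + dx] >= center:  # s(z) = 1 when z >= 0
            code += 1 << k                 # weight 2^k
    return code

def lbp_map(img):
    """LBP code for every interior pixel; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = lbp_value(img, y, x)
    return out

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_value(patch, 1, 1))  # 241 with the neighbor order chosen above
```
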

Feature Fusion in Parallel by Choquet Integral
Since the dimensions of HOG and HOLBP are consistent, it is possible to fuse these two feature descriptors in parallel. In this paper, Choquet integral based on fuzzy measure [34,35] is utilized as an aggregation tool to perform the fusion task.

Signed Fuzzy Measure
Let $X = \{x_1, x_2, \cdots, x_n\}$ denote the set of feature attributes under consideration. The set of all subsets of $X$ is called the power set of $X$ and is denoted by $P(X)$.

Definition 1. A signed fuzzy measure is a set function $\mu : P(X) \to (-\infty, \infty)$ satisfying $\mu(\emptyset) = 0$.
A signed fuzzy measure $\mu$ assigns a real value to each single element and to each possible combination of elements in $X$. If we regard the elements of $X$ as a set of features to be fused, then the value of the signed fuzzy measure on each feature, and on each possible combination of features, can be interpreted as its influence on the fusion target. Due to the nonadditivity of the signed fuzzy measure, the influence of a combination of features on the fusion target is not simply the sum of their individual influences. Therefore, the signed fuzzy measure defined on $X$ has an interpretable meaning: it indicates the possible interaction between the features to be fused.
A signed fuzzy measure can flexibly describe the individual and joint contributions of the features to be fused toward the fusion target. A signed fuzzy measure $\mu$ is called subadditive if $\mu(A \cup B) \le \mu(A) + \mu(B)$ whenever $A, B \subset X$, and super-additive if $\mu(A \cup B) \ge \mu(A) + \mu(B)$ whenever $A, B \subset X$ and $A \cap B = \emptyset$.
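As a small illustration of nonadditivity, a signed fuzzy measure on two features can be stored as a mapping from subsets to real values, and the interaction between two disjoint subsets can be read off as the gap between the joint value and the sum of the individual values. The measure values below are hypothetical and chosen only for the example.

```python
def interaction(mu, a, b):
    """Return mu(A ∪ B) - mu(A) - mu(B) for disjoint subsets A and B.

    Positive: the pair behaves super-additively; negative: subadditively;
    zero: additively.
    """
    a, b = frozenset(a), frozenset(b)
    if a & b:
        raise ValueError("A and B must be disjoint")
    return mu[a | b] - mu[a] - mu[b]

# Hypothetical signed fuzzy measure on X = {x1, x2}; mu(empty set) = 0.
mu = {
    frozenset(): 0.0,
    frozenset({"x1"}): 0.5,
    frozenset({"x2"}): 0.2,
    frozenset({"x1", "x2"}): 0.9,
}

gap = interaction(mu, {"x1"}, {"x2"})
print("super-additive" if gap > 0 else "subadditive" if gap < 0 else "additive")
```
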

Choquet Integral as Aggregation Tool
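Definition 2. Let $\mu$ be a signed fuzzy measure defined on $P(X)$. For a real-valued function $f : X \to (-\infty, \infty)$, its Choquet integral is defined as

$$ (C)\int f \, d\mu = \int_{-\infty}^{0} \big[\mu(F_\alpha) - \mu(X)\big] \, d\alpha + \int_{0}^{\infty} \mu(F_\alpha) \, d\alpha $$

where $F_\alpha = \{x \mid f(x) \ge \alpha\}$ is the set of elements whose function values are greater than or equal to $\alpha$, $\alpha \in (-\infty, \infty)$.
The values of $f$ for each element are denoted as $f(x_1), f(x_2), \cdots, f(x_n)$. To calculate the value of a Choquet integral for a given function $f$, they are usually sorted in nondecreasing order, $f(x_1^*) \le f(x_2^*) \le \cdots \le f(x_n^*)$, where $(x_1^*, x_2^*, \cdots, x_n^*)$ is a certain permutation of $\{x_1, x_2, \cdots, x_n\}$. Then the value of the Choquet integral can be obtained by

$$ (C)\int f \, d\mu = \sum_{i=1}^{n} \big[f(x_i^*) - f(x_{i-1}^*)\big] \, \mu(\{x_i^*, x_{i+1}^*, \cdots, x_n^*\}) \qquad (9) $$

where $f(x_0^*) = 0$. In practical programming, the sorting operation required by Equation (9) is inconvenient. The value of the Choquet integral of $f$ with respect to $\mu$ can instead be calculated in a linear form:

$$ (C)\int f \, d\mu = \sum_{j=1}^{2^n - 1} z_j \mu_j $$

in which $\mu_j = \mu(\{x_i \mid \mathrm{frc}(j/2^i) \in [1/2, 1)\})$ and

$$ z_j = \begin{cases} \min\limits_{i:\, \mathrm{frc}(j/2^i) \in [1/2, 1)} f(x_i) - \max\limits_{i:\, \mathrm{frc}(j/2^i) \in [0, 1/2)} f(x_i), & \text{if this difference is positive or } j = 2^n - 1 \\ 0, & \text{otherwise} \end{cases} $$

for $j = 1, 2, \cdots, 2^n - 1$. Here $\mathrm{frc}(\cdot)$ denotes the fractional part, and the maximum over an empty set is taken as zero.
The definition of the Choquet integral shows that it is a mapping from an n-dimensional space to a real value, so it is usually regarded as a powerful tool for aggregating different features; the result of the aggregation can then be used to solve data classification or regression problems.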
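The sorted form and the linear form of the Choquet integral can be compared with a short sketch. The function names and the representation of $\mu$ as a dictionary keyed by frozensets of 0-based feature indices are assumptions made for the example, and the measure values are hypothetical. In the linear form, the condition on $\mathrm{frc}(j/2^i)$ is equivalent to testing whether the corresponding bit of $j$ is set.

```python
def choquet_sorted(f, mu):
    """Choquet integral via sorting: sum of [f(x_i*) - f(x_{i-1}*)] * mu(tail)."""
    order = sorted(range(len(f)), key=lambda i: f[i])
    total, prev = 0.0, 0.0  # prev plays the role of f(x_0*) = 0
    for pos, i in enumerate(order):
        total += (f[i] - prev) * mu[frozenset(order[pos:])]
        prev = f[i]
    return total

def choquet_linear(f, mu):
    """Choquet integral as a sum of z_j * mu_j over all nonempty subsets j."""
    n = len(f)
    full = 2 ** n - 1
    total = 0.0
    for j in range(1, full + 1):
        inside = [i for i in range(n) if (j >> i) & 1]
        outside = [i for i in range(n) if not (j >> i) & 1]
        # The maximum over an empty set is taken as zero.
        z = min(f[i] for i in inside) - (max(f[i] for i in outside) if outside else 0.0)
        if z > 0 or j == full:
            total += z * mu[frozenset(inside)]
    return total

# Hypothetical signed fuzzy measure on three features (indices 0, 1, 2).
mu = {
    frozenset({0}): 0.3, frozenset({1}): -0.1, frozenset({2}): 0.4,
    frozenset({0, 1}): 0.5, frozenset({0, 2}): 0.6, frozenset({1, 2}): 0.2,
    frozenset({0, 1, 2}): 1.0,
}
f = [0.2, 0.6, 0.4]
print(choquet_sorted(f, mu), choquet_linear(f, mu))  # both approximately 0.22
```
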

Feature Fusion by Choquet Integral
For a sliding window, each pair of corresponding components of the HOG and HOLBP feature vectors forms a set of feature attributes, denoted as $X = \{x_1, x_2\}$. A real-valued function $f$ is defined on $X$ by assigning to $f(x_1)$ and $f(x_2)$ the numerical values of the corresponding components of the HOG feature and the HOLBP feature, respectively. A signed fuzzy measure $\mu$ is defined on $P(X)$ to describe the influence of each individual feature, as well as of each possible combination of features, on the fusion target. Since $\mu(\emptyset) = 0$, three values of $\mu$ need to be set in this case, namely $\mu(\{x_1\})$, $\mu(\{x_2\})$ and $\mu(\{x_1, x_2\})$. Figure 6 illustrates the process of fusing HOG and HOLBP by the Choquet integral.
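For two features, the Choquet integral has a simple closed form, so the parallel fusion can be sketched as an element-wise operation that keeps the vector length unchanged. The function names are hypothetical; the values 0.45 and 0.10 are the empirically chosen $\mu(\{x_1\})$ and $\mu(\{x_2\})$ reported later in the experiments, while $\mu(\{x_1, x_2\}) = 1.0$ is an assumption made only for this example.

```python
def choquet2(a, b, mu1, mu2, mu12):
    """Choquet integral of (a, b) on X = {x1, x2}.

    The smaller value is weighted by mu({x1, x2}); the excess of the larger
    value is weighted by the measure of the feature that carries it.
    """
    return min(a, b) * mu12 + max(a - b, 0.0) * mu1 + max(b - a, 0.0) * mu2

def fuse_parallel(hog, holbp, mu1, mu2, mu12):
    """Fuse two equal-length feature vectors component by component."""
    if len(hog) != len(holbp):
        raise ValueError("HOG and HOLBP vectors must have the same length")
    return [choquet2(a, b, mu1, mu2, mu12) for a, b in zip(hog, holbp)]

hog = [0.2, 0.6, 0.5]     # toy values, not real descriptors
holbp = [0.6, 0.2, 0.5]
fused = fuse_parallel(hog, holbp, mu1=0.45, mu2=0.10, mu12=1.0)
print(len(fused), len(hog + holbp))  # 3 for parallel fusion, 6 for concatenation
```
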

Pedestrian Detection Framework with Parameters Retrieved by Genetic Algorithm
To accomplish the parallel feature fusion of HOG and HOLBP via the Choquet integral based on the signed fuzzy measure, a set of internal parameters, i.e., $\mu(\{x_1\})$, $\mu(\{x_2\})$ and $\mu(\{x_1, x_2\})$, needs to be retrieved. These values could be set by trial and error, but a more principled way is to retrieve them through a global optimization algorithm.
As shown in Figure 7, a genetic algorithm is an adaptive optimization algorithm that performs a global search. It comprises initialization of the population, evaluation of each individual in the population, selection, reproduction (crossover), and mutation.

Parameters Retrieving under Genetic Algorithm Framework
In the genetic algorithm for parameter retrieving, each individual (chromosome) represents a signed fuzzy measure. With binary coding, each chromosome is composed of 30 genes (10 genes for each parameter to be optimized), and the value of each gene is a binary digit. Each chromosome is decoded into three real values between 0 and 1, corresponding to the normalized values of $\mu(\{x_1\})$, $\mu(\{x_2\})$ and $\mu(\{x_1, x_2\})$. The fitness value of each chromosome is evaluated by the AUC (area under curve) of the ROC (receiver operating characteristic) curve. Since the probability that a chromosome is selected to generate offspring depends on its fitness value, the GA-based parameter optimization algorithm for pedestrian detection takes the maximum AUC as its optimization criterion.
Figure 8 shows the parameter retrieving process based on the GA structure as applied to pedestrian detection. The algorithm starts with a randomly generated initial population. Each chromosome in the population is decoded into a set of values that represents a specific signed fuzzy measure. The HOG feature extraction and HOLBP feature construction presented in Section 2 are performed. The two sets of features are fused by the Choquet integral with respect to the signed fuzzy measure represented by the corresponding chromosome in the current population. The fusion yields a new set of features, called parallel-HOG-HOLBP, which is then passed to the SVM for classification. The same process is carried out for each sliding window of the images in the INRIA data set [36]. An ROC curve is constructed for each individual of the population, and the AUC calculated from it is used as the fitness value of the chromosome under consideration. Then, a tournament selection is conducted.
Better individuals (with higher AUC values) have more opportunities to undergo randomly chosen genetic operators and produce offspring. The population is updated with the newly created offspring. This process is repeated until the number of individuals generated exceeds the preset maximum population size. During iteration, to preserve the global search capability, some special operations are applied when the best fitness value remains unchanged for a number of successive generations (the default value is 20). The individuals in the original population are divided into three parts according to the ascending order of fitness value. The excellent individuals in the first part are all retained, the individuals in the second part produce new offspring by random mutation, and the individuals in the third part are randomly replaced by new individuals produced by previous genetic operations. As a result, the population is updated and the program continues to iterate.
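The encoding and the main loop described above can be sketched as follows. This is a simplified illustration under several assumptions: the fitness function is a toy placeholder (in the paper it would require fusing features, training the SVM and computing the evaluation score), the operators are one-point crossover and bit-flip mutation, the best individual is copied unchanged into the next generation, and the special handling of stagnating generations is omitted. All function names and default values are hypothetical.

```python
import random

BITS = 10  # genes per parameter, as stated in the text

def decode(chrom):
    """Map a 30-gene binary chromosome to three real values in [0, 1]."""
    values = []
    for k in range(0, len(chrom), BITS):
        as_int = int("".join(map(str, chrom[k:k + BITS])), 2)
        values.append(as_int / (2 ** BITS - 1))
    return tuple(values)

def tournament(pop, fit, rng, size=2):
    """Pick `size` random individuals and return the fittest of them."""
    picks = rng.sample(range(len(pop)), size)
    return pop[max(picks, key=lambda i: fit[i])]

def crossover(a, b, rng):
    """One-point crossover (an assumed operator)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chrom, rng, rate=0.02):
    """Flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chrom]

def run_ga(fitness, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(3 * BITS)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fit = [fitness(decode(c)) for c in pop]
        best = pop[max(range(pop_size), key=lambda i: fit[i])]
        history.append(max(fit))
        children = [best]  # assumption: keep the best individual unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, fit, rng), tournament(pop, fit, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    return decode(best), history

def toy_fitness(params):
    """Placeholder fitness: closeness to an arbitrary target point."""
    target = (0.3, 0.2, 0.9)
    return -sum((p - t) ** 2 for p, t in zip(params, target))

params, history = run_ga(toy_fitness)
print(params, history[-1])
```
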

Classifier Training
Each chromosome corresponds to a signed fuzzy measure. Based on each signed fuzzy measure, a Choquet integral fuses the HOG and HOLBP features of a candidate image, and the generated parallel-HOG-HOLBP features are sent to an SVM classifier, so that the performance of the chromosome can be evaluated from the classification results on a set of testing images.
Based on the principle of structural risk minimization, the support vector machine is very powerful in dealing with nonlinear problems. The algorithm uses training samples to find an optimal hyperplane in a high-dimensional space that separates the samples of the two classes. First, the parallel-HOG-HOLBP features of positive and negative samples are calculated as the input of the SVM classifier. Then the final decision function is calculated as

$$ f(x) = w^{T} \varphi(x) + b $$

where $w$ and $b$ are the weight vector and bias of the hyperplane, and $\varphi : X \to F$ is a nonlinear mapping from the input space to a high-dimensional feature space. $f(x)$ is optimal in the sense of maximizing the distance between the hyperplane and the nearest point $\varphi(x_i)$. The following formulation is usually used to solve this optimization problem:

$$ \min_{w, b, \xi} \ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad y_i\big(w^{T}\varphi(x_i) + b\big) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \cdots, N $$

where $y_i \in \{-1, +1\}$ is the label of sample $x_i$, $N$ is the number of training samples, and $\xi_i$ is a slack variable, which corresponds to the vertical distance from each wrongly classified sample point to the corresponding boundary hyperplane. Parameter $C$ is the penalty coefficient; the larger it is, the more severe the penalty.
In this paper, the INRIA data set [36] is selected as the training set for the SVM classifier, because it is a benchmark data set widely used in pedestrian detection. We extract positive samples from the INRIA training set according to the pedestrian coordinates marked in the dataset, and construct negative samples from the training set by random cropping. After extraction, the training set consists of 2416 positive samples and 12,180 negative samples from the INRIA dataset. The parameters of the SVM classifier are shown in Table 1.
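To make the role of the slack variables and the penalty coefficient C in the soft-margin formulation concrete, the sketch below evaluates the objective for a given hyperplane. It assumes a linear kernel (the mapping φ is the identity) and uses toy two-dimensional samples; it evaluates the objective only and is not an SVM solver. The function name is hypothetical.

```python
def svm_primal_objective(w, b, samples, labels, C):
    """Return (objective, slacks) for the soft-margin SVM with a linear kernel.

    For fixed w and b, the smallest feasible slack of sample i is
    xi_i = max(0, 1 - y_i * (w . x_i + b)).
    """
    margin_term = 0.5 * sum(wi * wi for wi in w)
    slacks = []
    for x, y in zip(samples, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        slacks.append(max(0.0, 1.0 - y * score))
    return margin_term + C * sum(slacks), slacks

# Toy data: the third sample lies inside the margin and needs a slack of 0.5.
samples = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
labels = [1, -1, 1]
objective, slacks = svm_primal_objective([1.0, 0.0], 0.0, samples, labels, C=10.0)
print(objective, slacks)  # 5.5 and [0.0, 0.0, 0.5]
```

A larger C makes the same slack more expensive, which is the sense in which the penalty becomes more severe.
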

Classifier Training and Evaluation Criterion
To evaluate the classifier constructed by each chromosome in the current iteration, 1126 positive samples and 453 negative samples are extracted from the INRIA testing set. For each chromosome, a confusion matrix is summarized by four indicators, as shown in Figure 9. The indicators represent four situations:

1. The actual value is true, and the classifier assigns it to be positive (True Positive = TP);
2. The actual value is true, and the classifier assigns it to be negative (False Negative = FN);
3. The actual value is false, and the classifier assigns it to be positive (False Positive = FP);
4. The actual value is false, and the classifier assigns it to be negative (True Negative = TN).
Three further indicators are then calculated:

$$ \text{precision} = \frac{TP}{TP + FP} \qquad (14) $$

$$ \text{recall} = \frac{TP}{TP + FN} \qquad (15) $$

$$ \text{F1 Score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \qquad (16) $$

Here, precision is the fraction of samples classified as positive that are actually positive, and recall is the fraction of actually positive samples that are classified as positive. The F1 Score combines the two, reflecting that the goal of optimization is to find the best balance between precision and recall. In our algorithm, the F1 Score is used as the fitness value of each chromosome in the iterations.
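The three indicators follow directly from the confusion-matrix counts. The sketch below uses made-up counts, not results from the paper, and returns 0.0 when a denominator is zero, which is a convention assumed for the example.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 score from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Made-up counts for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=5, fn=10)
print(round(p, 4), round(r, 4), round(f1, 4))  # 0.9474 0.9 0.9231
```
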

Data Construction
This paper uses the INRIA pedestrian data set to construct a training set and a testing set. We extract positive samples from the INRIA training set according to the pedestrian coordinates marked in the dataset, and construct negative samples from the training set by random cropping. After construction, the training set consists of 2416 positive samples and 12,180 negative samples, while the testing set consists of 1126 positive samples and 453 negative samples.

Experimental Results and Analysis
In order to validate the performance of the parallel-HOG-HOLBP features and the GA-based pedestrian detection algorithm proposed in this paper, four classifiers with different combinations of features are tested on the same testing set. They are:
1. SVM classifier with HOG features, denoted as HOG-SVM;
2. SVM classifier with serial fusion of HOG and LBP features, denoted as HOG-LBP-SVM;
3. SVM classifier with parallel-HOG-HOLBP features whose fusion parameters are set by experience, denoted as HOG-HOLBP-SVM;
4. SVM classifier with parallel-HOG-HOLBP features whose fusion parameters are optimized by the GA process, denoted as HOG-HOLBP-GA-SVM.
The experimental results of HOG-SVM and HOG-LBP-SVM are expressed as confusion matrices shown in Tables 2 and 3.
When the Choquet integral is used to fuse HOG and HOLBP in parallel, suitable values of the signed fuzzy measure must be determined, because they are the essential parameters of the Choquet integral and directly affect the effectiveness of the subsequent pedestrian detection. In the HOG-HOLBP-SVM experiments, 10 groups of signed fuzzy measure values are chosen by experience. Their performance is evaluated by F1 score and shown in Figure 10. The F1 score is best with $\mu(\{x_1\}) = 0.45$ and $\mu(\{x_2\}) = 0.10$. The HOG-HOLBP-SVM experiment based on this best combination is conducted, and the detection results are expressed as the confusion matrix shown in Table 4. Keeping the dataset unchanged, the GA-based feature fusion and pedestrian detection algorithm (HOG-HOLBP-GA-SVM) is then trained and tested. The parameters of the signed fuzzy measure of the Choquet integral used for the parallel feature fusion are optimized by the genetic algorithm during iteration, as explained in Section 4.1.
With the same running parameters of the genetic algorithm, HOG-HOLBP-GA-SVM was run for 10 trials. The results are recorded in Table 5, where each row gives the minimum, maximum and average fitness value at the end of the corresponding trial. As shown in Table 5, among the 10 randomly initialized trials, Trial 3 gives the best optimization result: at the end of the run, the maximum fitness value reaches 0.9758, with the optimized parameters $\mu(\{x_1\}) = 0.382$ and $\mu(\{x_2\}) = 0.174$. The standard deviations of the three measurements over the 10 trials are shown in the bottom row of Table 5. The optimization process of Trial 3 is shown in Table 6, and the confusion matrix of the HOG-HOLBP-GA-SVM experiment for this trial is shown in Table 7. In the remaining trials, HOG-HOLBP-GA-SVM also reaches the neighborhood of the optimized point. This shows that HOG-HOLBP-GA-SVM performs satisfactorily in both efficiency and effectiveness.
To compare the performance of the four classifiers, the results in the confusion matrices (Tables 2-4 and 7) are converted to three indicators, i.e., precision, recall and F1 score. The experimental results of the four methods are shown in Table 8, where precision, recall, F1 score, and feature extraction time are reported. As shown in Table 8, all three detection indicators of the two parallel feature fusion methods (HOG-HOLBP-SVM and HOG-HOLBP-GA-SVM) are superior to those of HOG-SVM and the serial feature fusion method (HOG-LBP-SVM). In addition, the feature extraction time drops markedly, from 88.285 and 131.854 ms per frame for the two baseline methods to 10.075 and 10.126 ms per frame for the two parallel fusion methods, respectively. This reduction in execution time shows that the parallel feature fusion algorithm has better real-time performance than the serial feature fusion algorithm.
Figure 11 compares the precision, recall and F1 score of the four algorithms on the testing set, and a summary of the performance comparison among the four algorithms is depicted in Figure 12. A ROC (receiver operating characteristic) curve was drawn with the false positive rate as the horizontal coordinate and the true positive rate as the vertical coordinate to estimate the influence of sample distribution on the performance of the algorithm.
The larger the area under the ROC curve, the better the performance of the classifier. Figure 13 shows the ROC curves of the HOG, HOG-LBP, HOG-HOLBP, and HOG-HOLBP-GA features, each with SVM as the classifier. It can be seen that the HOG-HOLBP-GA-SVM classifier clearly performs better than the other classifiers.
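The area under the ROC curve can be computed from classifier scores by sweeping a decision threshold and integrating with the trapezoidal rule. The sketch below is a generic illustration with made-up labels and scores; the function name is hypothetical, and it assumes that both classes are present in the labels.

```python
def roc_auc(labels, scores):
    """Return (ROC points, AUC) for binary labels (1 or 0) and real scores.

    Points are (false positive rate, true positive rate) pairs obtained by
    lowering the threshold; samples with tied scores are added together.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    auc = sum((x1 - x0) * (y0 + y1) / 2
              for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return points, auc

# Made-up scores: one negative sample is ranked above one positive sample.
points, auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(auc)  # 0.75
```
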
In addition, FPPI (false positives per image) curves are drawn from the detection results of each algorithm. FPPI is the average number of false detections per image, which is closer to how the classifier behaves in practical applications. The lower the curve in the graph, the better the corresponding model. The seven FPPI curves shown in Figure 14 compare the proposed pedestrian detection algorithms with other popular pedestrian detection algorithms. It can be seen from the figure that the HOG-HOLBP-GA-SVM classifier achieves good performance.
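A single point on such a curve can be computed from per-image detection counts as sketched below. The text does not name the quantity on the vertical axis; pairing FPPI with the miss rate is a common convention and is an assumption here, as are the function name and the made-up counts.

```python
def fppi_and_miss_rate(per_image):
    """per_image: list of (tp, fp, fn) detection counts, one tuple per image.

    Returns (false positives per image, miss rate) at one operating point.
    """
    n_images = len(per_image)
    tp = sum(c[0] for c in per_image)
    fp = sum(c[1] for c in per_image)
    fn = sum(c[2] for c in per_image)
    fppi = fp / n_images
    miss_rate = fn / (tp + fn) if tp + fn else 0.0
    return fppi, miss_rate

# Made-up counts for three images.
fppi, miss = fppi_and_miss_rate([(2, 1, 0), (1, 0, 1), (0, 1, 1)])
print(round(fppi, 4), miss)  # 0.6667 0.4
```
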

Conclusions
The key issue in pedestrian detection is to extract efficient features so that detection can be performed correctly and promptly. This paper presents a novel parallel framework that addresses this issue by means of the Choquet integral, which allows the two features (HOG and LBP) to be merged in parallel. The resulting feature, parallel-HOG-HOLBP, not only retains the original advantages of HOG and LBP, but also avoids the growth in dimension that is inherent in traditional serial fusion. A genetic algorithm is used to retrieve the internal parameters of the fuzzy measure in the Choquet integral, which is a more rational way to obtain these parameters than trial and error. We conducted extensive experiments, which show that the proposed method is more effective than the original methods it is compared with. Our research indicates that parallel feature fusion is an effective and promising way to improve pedestrian detection performance.