Location Strategy for Traffic Emission Remote Sensing Monitors to Capture the Violated Emissions

Air contamination has become an urgent problem as a result of the rapid growth in traffic all over the world. Traffic emissions differ from vehicle to vehicle depending on the vehicle type, production year, fuel octane number, and periodic maintenance of the vehicle. The majority of drivers do not check their vehicles' harmful emissions regularly. Therefore, effective tracking of high-emitting vehicles can be an important measure for reducing traffic air pollution. This study proposes a location strategy for vehicle remote sensing monitors, aided by a plate-ID recognizer, to capture any violating vehicle emissions. The problem is formulated as a graph theory problem, and a novel adapted metaheuristic algorithm is then used to solve it. On a benchmark problem, the methodology solved the problem to optimality. Moreover, its robustness is measured statistically.


Introduction
Since the middle of the past century, traffic has grown tremendously all over the world. This growth has had many positive consequences for people's social and economic lives. However, the environment has borne the major impact. On-road motor vehicles are one of the largest contributors to air pollution in urban environments [1]. Vehicle emissions contain a vast number of pollutants, some of them toxic [2].
Therefore, comprehensive regulations and incentive programs have been set to reduce vehicle emissions despite the increased economic load on drivers [3]. To enforce these regulations and mitigate air pollution, monitoring on-road vehicle emissions is an urgent task. The exhaust from vehicles contains carbon monoxide (CO), hydrocarbons (HC), nitrogen oxide (NO), particulate matter (PM), and many other toxic substances. Some vehicles emit these substances at levels that exceed the allowable threshold because their owners neglect periodic vehicle checks or, for economic reasons, do not use the appropriate type of fuel [4]. For example, a case study of California vehicles noted that high-emitting vehicles would account for up to 6% of the vehicle population and vehicle miles traveled, yet they are expected to contribute more than 75% of exhaust and 66% of evaporative emissions in 2030 [5].
Strict regulations with real-time tickets for violating vehicles would force drivers to check their cars' emissions regularly. The observed behavior of drivers in response to tickets for speeding and unfastened seat belts may support this claim [6]. Therefore, installing the following surveillance system on a road section may be beneficial; see Figure 1. The designed system aims to monitor each vehicle's ID and its emissions simultaneously. This would help to keep a record of each passing vehicle's emissions and consequently detect violating emissions [7].
Vehicle emission remote sensing has been studied extensively in the past decades. It has proved to be an effective surveillance tool for identifying high CO emitting vehicles [8]. It is also used to detect CO2, HC, NO, and PM emissions [9]. The accuracy of such a surveillance system is reasonable once scaling factors are applied [10]. Hence, it could be used to judge the emissions of vehicles on the road section where it is deployed.
The cost of the stated system appears too high for it to be provided on every link of a transport network. So, the problem arising here is how to determine the minimum number of sensor locations needed to monitor every vehicle on the network at least once. The solution to this problem would guarantee full observation of traffic emissions throughout the network. Real-time observation would accelerate the detection of violating vehicles, and a ticketing policy or warning messages would then push the vehicle owner to fix the deficiency causing the emission violation.
In this study, a novel strategy is developed to minimize the number of emission sensors (i.e., the system proposed in Figure 1) that must be installed on a network to intercept all the traffic (i.e., every vehicle at least once). The problem is formulated as a constrained integer programming model. The complexity of solving the formulated model with exact methods is shown. Therefore, a strategy that depends on simple stochastic iterative rules (metaheuristics) is presented as an effective alternative. The structure of the paper is as follows. Section 2 presents the state-of-the-art. Section 3 provides the problem formulation and basic input data. Section 4 illustrates the proposed methodology. In Section 5, a real case study is used to evaluate and validate the proposed methodology. Section 6 presents the conclusion.

State-of-the-Art
The stated problem is relatively new; however, the sensor location problem (SLP) has been studied extensively for the last three decades. As a matter of logic, providing sensors on all network links is not a practical strategy. Consequently, attempts are made to specify the best number and locations of sensors to serve the objective of their installation. Therefore, the purpose of installing sensors on the network and the type of sensors need to be determined before a sensor location strategy is developed.
In the following review, the focus is directed to each type of sensor with respect to its application and the proposed location model. The first type of sensor used is the traffic counting sensor, which is deployed to automatically count the traffic at a road section [11]. The counting data of different road sections (links) are analyzed to predict the Origin/Destination (O/D) matrix [12]. In [13], a theoretical investigation is made of the reliability of an O/D trip matrix estimated from traffic counts on a portion of the network links. This pioneering work opened the gate for many later studies that developed different SLP models in the attempt to obtain the most accurate O/D matrix [14]. In [15], four location rules are stated as rules of thumb to judge the quality of a set of counting sensors used to estimate the O/D matrix. To find the optimum number and locations of sensors that satisfy these rules, different metaheuristics are used, such as genetic algorithms [16], greedy algorithms [17], a distance-based genetic algorithm [18], compressed sensing [19], and randomized priority search [20-25].
Passive traffic counting sensors have been developed into image sensors (video sensors) that can be placed at network intersections (nodes) to monitor the traffic turning ratios. This helps in observing the flow on all links if sensors are provided at every node. Interestingly, the number of sensors required to achieve full flow observability can be decreased with the aid of the flow conservation concept at nodes (inflow = outflow). In [14,26], mathematical formulations with both exact and heuristic solution algorithms are presented for this location problem.
The vehicle identification sensor (plate-ID recognizer) is another type of sensor, which not only counts the vehicles but also records each vehicle's plate ID. This makes it possible to track the links visited by each vehicle, which helps the proposed models to reconstruct different paths. In [27,28], the sensors are placed to uniquely identify flow paths. In [29], it is proved that the same problem could be solved with fewer sensors if the sensors are able to arrange the visited links chronologically.
Time and speed sensors are a logistic type of sensor that can provide real-time information about traffic conditions. In [30], the SLP is optimized for travel time estimation on a single road section. That work assumes the existence of advanced communication between vehicles and the infrastructure in addition to vehicle-to-vehicle connection. The assumptions used within the formulated model enhance the results dramatically.
Although extensive research has been conducted to analyze the counting sensor location problem, only a few analytical studies have dealt with the emission sensor location problem. To the best of our knowledge, there is no study oriented to solving this problem except [31], in which the problem of locating emission remote sensing monitors is addressed graphically. Thus, there is a need to develop new mathematical formulations and solution algorithms that may provide better performance in solving this novel topic.
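The contribution of this work is summarized in the following points:
(i) An integer mathematical formulation is developed for the stated problem. It depends only on the path enumeration for network demand node pairs and does not require any data about the O/D matrix flow to intercept each vehicle.
(ii) A novel metaheuristic algorithm is developed to make the model applicable to large-scale networks.
(iii) The methodology solves the case study to optimality (compared with an exact method solution) and gives a good statistical performance.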

Problem Formulation
3.1. Input Data. Consider a given directed road network, $G = (V, L)$, where $V$ is the set of vertices, which are connected by the set of links $L = \{l_1, l_2, \ldots, l_a, \ldots\}$. There is a set $I = \{1, 2, \ldots, i, \ldots\}$ containing every demand node pair ($i$), i.e., every pair of vertices with interchangeable flow. Each link is weighted with its travel time. Network path enumeration is a requirement for the developed sensor location model. The k-shortest path algorithm is used to generate all possible paths between each node pair [26]. The algorithm terminates when a defined number of ($k$) paths is reached or when a path takes more than 1.5 times the shortest path time. According to the expertise of transportation planners in network loading scenarios, routes that take more than 1.5 times the shortest path time are considered circuitous by travelers [27]. For each demand node pair ($i$), all paths ($h_{ki}$) are stored in the set $H_i$. Finally, a path-link incidence matrix is generated; see (1).

Path-link incidence matrix:

$$\delta_{a}^{h_{ki}} = \begin{cases} 1 & \text{if path } h_{ki} \text{ contains link } a, \\ 0 & \text{otherwise,} \end{cases} \qquad \forall a \in L,\ \forall h_{ki} \in H_i,\ \forall i \in I. \tag{1}$$

3.2. Mathematical Formulation. To build up the proposed model, consider a small network made of eight vertices, two of which form a demand node pair (O and D); see Figure 2. The node pair is connected by numerous paths. Different link combinations can be identified that intercept all these paths. The minimum number of chosen links is two, both ending at D. The problem can now be stated clearly as finding the minimum number and location of links that no path can bypass. In other words, every network path must be covered by at least one of its links.
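The input data step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it replaces a dedicated k-shortest path algorithm with a brute-force depth-first enumeration that is only practical for small networks, and the four-node network, the function names, and the default `k` are hypothetical. It applies the two termination criteria described above (at most `k` paths, and no path longer than 1.5 times the shortest path time) and then builds the path-link incidence matrix.

```python
import heapq

def shortest_time(graph, origin, dest):
    """Dijkstra travel time from origin to dest; graph[u] = {v: time}."""
    best = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == dest:
            return t
        if t > best.get(u, float("inf")):
            continue
        for v, w in graph.get(u, {}).items():
            if t + w < best.get(v, float("inf")):
                best[v] = t + w
                heapq.heappush(heap, (t + w, v))
    return float("inf")

def enumerate_paths(graph, origin, dest, k=4, detour=1.5):
    """Up to k simple paths whose time is at most detour * shortest time."""
    limit = detour * shortest_time(graph, origin, dest)
    found = []

    def dfs(node, path, t):
        if t > limit:          # circuitous route: stop extending it
            return
        if node == dest:
            found.append((t, list(path)))
            return
        for nxt, w in graph.get(node, {}).items():
            if nxt not in path:
                path.append(nxt)
                dfs(nxt, path, t + w)
                path.pop()

    dfs(origin, [origin], 0.0)
    found.sort()               # shortest paths first
    return [p for _, p in found[:k]]

def incidence_matrix(paths, links):
    """Rows are paths, columns are links; entry 1 if the path uses the link."""
    rows = []
    for p in paths:
        used = set(zip(p, p[1:]))
        rows.append([1 if link in used else 0 for link in links])
    return rows

# Hypothetical 4-node network with travel times (not taken from the paper).
graph = {"O": {"A": 2, "B": 3}, "A": {"D": 2, "B": 1}, "B": {"D": 2}, "D": {}}
links = [(u, v) for u in graph for v in graph[u]]
paths = enumerate_paths(graph, "O", "D", k=3)
matrix = incidence_matrix(paths, links)
```

Here the shortest path O-A-D takes 4 time units, so any route longer than 6 units would be discarded; all three simple paths of this toy network fall within that limit.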
Mathematically, the considered problem is represented as follows:

$$\min \sum_{a \in L} z_a \tag{2}$$

subject to

$$\sum_{a \in L} \delta_{a}^{h_{ki}} z_a \geq 1, \qquad \forall h_{ki} \in H_i,\ \forall i \in I, \tag{3}$$

$$z_a \in \{0, 1\}, \qquad \forall a \in L, \tag{4}$$

$$\delta_{a}^{h_{ki}} \in \{0, 1\}, \qquad \forall a \in L,\ \forall h_{ki} \in H_i,\ \forall i \in I, \tag{5}$$

where $z_a$ is a dummy variable equal to 1 if a sensor is located on link ($a$) and 0 otherwise; $\delta_{a}^{h_{ki}}$ is a dummy variable equal to 1 if path ($h_{ki}$) contains link ($a$) and 0 otherwise. $H_i$ is the set of all feasible paths connecting the node pair ($i$). Equation (2) is the objective function, which minimizes the number of required remote emission sensing monitors. Constraint (3) stipulates the coverage rule of the chosen locations in order to monitor all the traffic paths. The variables' domains are defined in (4) and (5).

Algorithm 1: Pseudocode.
...
5. Let a working copy of the incidence matrix be made.
6. Set the rows of the copy to be the set of uncovered paths.
...
8. Update the link weights.
9. Sort the links in descending order according to their weights.
10. Choose the first link with the selection probability; if it is not chosen, go to the next link until one is picked.
11. Change the selected link randomly according to the mutation probability, then add it to the solution set.
12. Update the copy by deleting the rows covered by the selected link and the columns with no contributions.
13. end while
...
15. Choose a percentage of the solution set randomly, according to the elitism probability, as a part of the next iteration.
16. end for
17. return the solution set
18. End Algorithm
The model presented in (2)-(5) is a constrained integer programming model. This formulation was proved to be NP-hard in [32]. Existing exact methods can solve it up to a reasonable scale; however, large-scale problems still constitute an obstacle for these methods. Heuristic methods provide an alternative way of solving the problem. In the next subsections, an effective metaheuristic method is developed to solve the problem.
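To make the coverage model and its complexity concrete, the following Python sketch solves a tiny instance exactly by enumerating link subsets of increasing size. It is an illustration only and not the exact method used in the paper; the 3-path by 5-link incidence matrix and the function names are hypothetical. The number of subsets doubles with every added link, which is why enumeration of this kind is hopeless for a network with 76 links.

```python
from itertools import combinations

def covers(matrix, chosen):
    """Coverage rule: every path (row) contains at least one chosen link."""
    return all(any(row[a] for a in chosen) for row in matrix)

def exact_min_cover(matrix):
    """Smallest set of link indices that covers all paths, by enumeration."""
    n_links = len(matrix[0])
    for size in range(1, n_links + 1):
        for chosen in combinations(range(n_links), size):
            if covers(matrix, chosen):
                return list(chosen)
    return None  # some path contains no link at all

# Hypothetical incidence matrix: rows are paths, columns are links.
matrix = [
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1],
]
solution = exact_min_cover(matrix)
```

No single link covers all three paths in this instance, so the first feasible subset found has two links.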
3.3. Direct Search. In this section, the optimality of the direct search method is examined through a small illustrative example. Figure 3 shows a given incidence matrix between network paths and links; the matrix represents a network with six links and two node pairs. For each node pair, three paths are generated. If one could wisely choose links (columns) that cover paths (rows), the optimal or a near-optimal solution would be found. The apparently wisest decision is to select the highest-covering link first for installing the sensor system. Three rows and four columns would then be left (two links have no remaining contribution, so they are erased). To cover the remaining rows, three more links should be selected. This procedure leads to a feasible solution with four links. However, the optimal solution uses only three links, which might be obtained with another order of selection. In the next section, the methodology follows the same selection procedure; however, some perturbation operators are introduced to avoid getting trapped in the aforementioned local optimum.
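The trap described above can be reproduced with a short Python sketch of the direct (greedy) search. The incidence matrix below is a hypothetical one built for this illustration, not the matrix of Figure 3, although it has the same size (six paths, six links) and shows the same behavior: the greedy choice ends with four links while three suffice.

```python
def greedy_cover(matrix):
    """Repeatedly pick the link (column) covering the most uncovered paths."""
    uncovered = set(range(len(matrix)))
    chosen = []
    while uncovered:
        gains = [sum(matrix[r][a] for r in uncovered)
                 for a in range(len(matrix[0]))]
        best = max(range(len(gains)), key=gains.__getitem__)
        if gains[best] == 0:
            raise ValueError("some path cannot be covered by any link")
        chosen.append(best)
        uncovered -= {r for r in uncovered if matrix[r][best]}
    return chosen

# Hypothetical 6-path x 6-link incidence matrix (illustrative only).
matrix = [
    # l0 l1 l2 l3 l4 l5
    [1, 1, 0, 0, 1, 0],  # p1
    [0, 1, 0, 0, 0, 0],  # p2
    [1, 0, 1, 0, 0, 1],  # p3
    [0, 0, 1, 0, 0, 0],  # p4
    [1, 0, 0, 1, 0, 0],  # p5
    [0, 0, 0, 1, 0, 0],  # p6
]
greedy = greedy_cover(matrix)   # starts with l0, the highest-covering link
optimal = [1, 2, 3]             # l1, l2, l3 already cover every path
```

Link l0 covers three paths and is picked first, but each of the three remaining paths then needs its own link, so l0 turns out to be redundant.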

Methodology
The proposed methodology depends on heuristics that assign a weight to each link according to the number of paths it covers. A selection procedure is then guided by these weights. Three perturbation operators are applied through the iterations to diversify the search with the aim of obtaining the optimal solution. The three operators are the selection, mutation, and elitism probabilities. The steps of the solution methodology are stated in Algorithm 1.
The solution methodology begins with creating the incidence matrix as in Figure 2. Each link is assigned a weight equal to the percentage of the total network paths that it covers.
Step 8 updates the link weights continuously through the iterations.
Step 10 may make the selection bypass the highly weighted links, aiming to enhance the resulting solutions. If the selection probability is equal to 100%, the algorithm chooses the link with the highest weight at every selection. Lowering this value makes the selection move in the vicinity of the highest-weighted links. In step 11, a high mutation probability makes the algorithm more random; however, it may help to avoid getting stuck in local optima.
Step 12 reduces the matrix for the next iteration.
Step 15 transfers a part of the current solution to the next iteration in the attempt to obtain a new starting point for link selection.
The interesting part of the developed methodology is that it combines similar operators from two popular metaheuristics, namely, simulated annealing (SA) [33] and genetic algorithms (GA) [34]. The selection probability resembles the annealing property in that it permits the algorithm to take a move that is not the best one, aiming to escape a local optimum if trapped. The mutation probability is found exactly in the GA. The elitism probability here allows information to be exchanged among the different iterations, like the crossover in the GA. In the next section, these operators are examined through sensitivity analysis, and the overall statistical robustness of the methodology is measured by an ANOVA test.
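One possible reading of the steps described in this section is sketched below in Python. It is an interpretation under stated assumptions rather than the authors' code: the function name, parameter names, and default probabilities are hypothetical; link weights are counts of uncovered paths (which rank links the same way as percentages); if no link passes the selection test, the last ranked link is taken; mutation is restricted to links that still cover an uncovered path so that the loop always terminates; and every path is assumed to contain at least one link.

```python
import random

def perturbed_cover(matrix, iterations=100, p_select=0.9, p_mutate=0.05,
                    p_elite=0.2, seed=0):
    """Weight-guided link selection with selection, mutation, and elitism."""
    rng = random.Random(seed)
    n_paths, n_links = len(matrix), len(matrix[0])
    best, carried = None, []
    for _ in range(iterations):
        solution = list(carried)            # links kept from the last pass
        uncovered = {r for r in range(n_paths)
                     if not any(matrix[r][a] for a in solution)}
        while uncovered:
            # Weight: number of still-uncovered paths that each link covers.
            weight = {a: sum(matrix[r][a] for r in uncovered)
                      for a in range(n_links)}
            useful = sorted((a for a in weight if weight[a] > 0),
                            key=lambda a: -weight[a])
            pick = useful[-1]               # assumed fallback: last candidate
            for a in useful:                # walk down the ranking
                if rng.random() < p_select:
                    pick = a
                    break
            if rng.random() < p_mutate:     # mutation: random replacement
                pick = rng.choice(useful)
            solution.append(pick)
            uncovered -= {r for r in uncovered if matrix[r][pick]}
        if best is None or len(solution) < len(best):
            best = solution
        keep = round(p_elite * len(solution))
        carried = rng.sample(solution, keep)  # elitism: seed the next pass
    return sorted(best)

# Hypothetical 6-path x 6-link incidence matrix (illustrative only);
# a purely greedy choice needs four links here, while three suffice.
matrix = [
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
best = perturbed_cover(matrix)
```

With `p_select=1.0`, `p_mutate=0.0`, and `p_elite=0.0`, the sketch reduces to the direct search; lowering the selection probability or carrying part of a solution forward lets later passes start from a different link.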

Experimental Study
To validate the proposed methodology, the real network of Sioux Falls is adopted. It was first introduced in [15] and has since been considered a benchmark problem in most of the SLP literature [29,35,36]. The network is made of 182 O/D pairs, 76 links, and 24 vertices. Figure 4 depicts the Sioux Falls network. The shaded nodes represent O/D node pairs (i). Table 1 gives the calibration process for the different parameters. Solving a network of this size took less than a minute on a workstation with two Intel® Xeon® E5530 processors (2.40 GHz) and 12 GB RAM.
Figure 5 shows the results of the methodology over 100 iterations. Although the results vary, the methodology maintained a small coefficient of variation (C = 0.067). The methodology managed to separate all the demand node pairs with a minimum of 45 sensors, whose distribution is shown in Figure 4. The solution represents nearly 60% of the network links. For the Sioux Falls network, (2)-(5) are also solved to optimality using the branch and bound technique presented in [37]. Interestingly, the results of the exact method and the proposed heuristic match, which lends more reliability to the methodology.
Different parameters are responsible for the diversity of the search process. Figures 6-8 depict the impact of each operator on the quality of the final results. Figure 6 shows that neither a selection probability equal to 100% (the direct search procedure) nor very low values help in reaching the optimum solution. The best calibration values range from 80% to 100%. Figure 7 shows that the results are very sensitive to the mutation probability; however, setting a low value may improve the results. Figure 8 indicates that keeping a reasonable percentage of the solution set through the iterations is beneficial. The best values for the elitism operator range from 15% to 25%.
Path enumeration is a critical issue for the proposed model, which depends on path/route enumeration. Even for a small network, a large number of paths could be generated for each demand node pair. This would lead to a combinatorial problem for the methodology. Fortunately, it is common for 3 or 4 paths to carry the vast majority of the traffic, and very rarely are more than 6 or 7 routes utilized [38,39]. Figure 9 shows the effect on the results of the maximum defined k (for the k-shortest path algorithm). If there is reliable information about the number of paths for the network under study, the number of required sensors may be reduced dramatically. Table 2 shows the difference in the resulting locations for two path setting values of k. To check the robustness of the methodology, nine other runs were made. Each run contains 100 different iterations. The number of links to be equipped with remote emission sensors in each iteration is depicted in the box plot in Figure 10. The ANOVA test is performed to check whether there is a significant difference among the results. The critical F ratio is equal to 1.98 with 9 degrees of freedom for the numerator and 90 degrees for the denominator at a significance level of 0.05. The obtained F = 1.63 (F < F critical), which means that the null hypothesis that the results are statistically equivalent cannot be rejected; i.e., there are no significant differences among the different runs.
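The F ratio used in the robustness check is the standard one-way ANOVA statistic, the between-run mean square divided by the within-run mean square. A minimal Python sketch of that computation follows; the function name and the sample values are hypothetical and are not the paper's data. The critical value against which F is compared would still have to be taken from an F table or a statistics library.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per run."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # Variation of the run means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations around their own run mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical sensor counts from three runs (illustrative numbers only).
runs = [[45, 47, 46], [46, 48, 47], [45, 46, 47]]
f_stat, df_b, df_w = one_way_anova_f(runs)
```

If the computed F is below the critical value for the given degrees of freedom, the hypothesis that all runs share the same mean is not rejected.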

Conclusion
This study presents a novel methodology for solving the remote emission sensing monitor location problem, which has received little attention in the literature. It aims to find the locations and the minimum number of sensors needed to capture every vehicle's emissions within a network. The heuristic steps of the methodology are simple to deploy and general for any network. They depend only on the path enumeration criteria, which are easily defined for transport networks. The path enumeration based algorithm proves valuable if the transportation planner can identify the paths actually used for each demand node pair, which would lower the number of required sensors dramatically. The consistency and effectiveness of the methodology are tested on a real medium-size network. Sensitivity analysis is made to clarify the effect of each parameter on the results. The results are validated against an exact method (branch and bound) and also statistically with an ANOVA test. Applying the proposed system would help to enforce allowable emission standards within the network, which can be a part of the solution to the air contamination problem. A further challenge arises when it comes to temporal dynamics. The methodology assumes intercepting each vehicle once a day; however, interception may be required within a defined time horizon. The work could be extended to optimize the locations of movable sensing systems under network dynamics to achieve a further reduction in the required number of sensors. A time-expanded network would be the right approach to adapt the proposed methodology.

Figure 1: Sensors system on a road section for vehicle emission remote sensing.
(i) An integer mathematical formulation is developed for the stated problem. It depends only on the path enumeration for network demand node pairs and does not require any data about the O/D matrix flow to intercept each vehicle.

Figure 9: Number of links versus number of paths generated between each demand node pair.

Table 2: Link set structure for two solutions with different path settings.