A simulation study of cooperative and autonomous vehicles (CAV) considering courtesy, ethics, and fairness

Autonomous vehicles (AV) can be programmed to act cooperatively. Previous research on cooperative and autonomous vehicles (CAV) suggests they can substantially improve traffic system operations in terms of mobility and safety. However, these studies do not explicitly take each vehicle's potential gain/loss into consideration and ignore each vehicle's individual willingness to cooperate. They do not account for ethics and fairness either. In this study, several cooperation/courtesy strategies are proposed to address the above issues. These strategies are grouped into two categories based on non-instrumental and instrumental principles. Non-instrumental strategies make courtesy/cooperation decisions based on some courtesy proxies and a user-specified courtesy level, while instrumental strategies are based only on courtesy proxies related to local traffic performance. Also, a new CAV behavior modeling framework is proposed based on our previous work on cooperative car-following and merging (CCM) control. With such a framework, the proposed courtesy strategies can be easily implemented. The proposed framework and courtesy strategies are coded in the SUMO microscopic traffic simulator. They are evaluated under different levels of traffic demand on a freeway corridor consisting of a work zone and three weaving areas of different types. Among other findings, the simulation results show that the instrumental Local Utilitarianism strategy performs the best in terms of mobility, safety, and fairness. In the future, auction-based strategies can be considered to model how CAV make decisions.


Introduction
Recent research on autonomous vehicles (AV) focuses on training them to make safe and efficient maneuvers from the viewpoint of individual vehicles. When a group of AV interact in the real world, even if they all make the "best" decisions individually, these interactions may not lead to a system-optimal state, similar to what the well-known Wardrop's first and second principles describe. This research targets an important emerging research area in AV: [...] for instrumental courtesy strategies, where decisions are made by rules that are designed to achieve certain goals. 4. Microscopic simulation studies are conducted to evaluate each courtesy strategy in terms of mobility, safety, and fairness.

Background
The courtesy strategies proposed and evaluated in this study are all based on the CCM framework. To make this paper self-contained, the CCM control is briefly described in this section. Abbreviations used in this study are defined in Table 1. CCM control facilitates merge maneuvers at lane reduction points caused by accidents, work zones, etc. It is built on ACC and CACC. Instead of following the lead vehicle in the same lane, an AV controlled by CCM follows the nearest (in terms of longitudinal distance) downstream lead vehicle regardless of which lane it is in, as illustrated in Fig 1. This downstream lead vehicle is referred to as the generalized lead vehicle, or G-lead vehicle for short. With this straightforward extension of CACC/ACC, a CCM-controlled AV can keep a "safe" longitudinal distance from the G-lead vehicle and facilitate smooth merging maneuvers before the lane closure point.
With CCM control, an AV's car-following behavior depends on whether its G-lead vehicle is equipped with an on-board device (OBD) to share real-time maneuver information. If the G-lead vehicle is equipped with OBD, the AV is assumed to have access to the G-lead vehicle's acceleration, speed, and position information. Otherwise, the AV can only know the G-lead vehicle's speed and position through its sensors (e.g., Lidar, camera). With CCM, the AV longitudinal behavior is described mathematically in Eq (1) below.
In Eq (1), $u_2(t)$ = acceleration of the following vehicle, $\ddot{x}_1(t)$ = acceleration of the G-lead vehicle, $\dot{x}_1(t)$ = speed of the G-lead vehicle, $\dot{x}_2(t)$ = speed of the following vehicle, $r(t)$ = current longitudinal distance between the G-lead and following vehicles, $\eta$ = jam distance, $\tau_e$ = the desired effective time-gap, and $k_0 = 1$, $k_1 > 0$, $k_2 > 0$ = gains. Here, OBD = 1 (0) indicates the G-lead vehicle is (is not) equipped with OBD. To differentiate from the traditional CACC, we refer to OBD = 1 as G-CACC mode. When OBD = 0, the AV will drive in the G-ACC mode. As with the CACC and G-CACC modes, the only difference between the ACC and G-ACC modes is that the AV follows the G-lead vehicle, not the lead vehicle in the same lane.
This study assumes that all vehicles are AV and equipped with OBD. Therefore, for CCM control only G-CACC is applicable. With properly calibrated coefficients $k_0$, $k_1$, and $k_2$ in Eq (1), string stability [19] can be achieved. In this study, the parameters calibrated by Van Arem et al. [20] are used.
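As an illustration only, the control law described above can be sketched in Python. The linear form used here (a gain-weighted lead acceleration, speed difference, and spacing error) is an assumption inferred from the variable definitions, since the display equation is not reproduced in this text. The values of `k1`, `k2`, `eta`, and `tau_e` are hypothetical placeholders, not the calibrated values from [20].

```python
def ccm_acceleration(a_lead, v_lead, v_follow, gap, obd,
                     k0=1.0, k1=0.5, k2=0.1, eta=2.0, tau_e=1.0):
    """Sketch of a G-CACC / G-ACC acceleration command for the following vehicle.

    Assumed form:
        u2 = [OBD] * k0 * a_lead + k1 * (v_lead - v_follow)
             + k2 * (gap - eta - tau_e * v_follow)
    k1, k2, eta, and tau_e are placeholder values for illustration.
    """
    spacing_error = gap - eta - tau_e * v_follow
    # The lead vehicle's acceleration is only available when it has an OBD.
    feedforward = k0 * a_lead if obd else 0.0
    return feedforward + k1 * (v_lead - v_follow) + k2 * spacing_error


# At the desired spacing with equal speeds, only the feedforward term remains.
u_cacc = ccm_acceleration(a_lead=-1.0, v_lead=20.0, v_follow=20.0, gap=22.0, obd=True)
u_acc = ccm_acceleration(a_lead=-1.0, v_lead=20.0, v_follow=20.0, gap=22.0, obd=False)
```

Under these assumptions, the G-CACC follower mirrors the leader's braking immediately, while the G-ACC follower reacts only once a speed or spacing error develops.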
With CCM, vehicles in different lanes cooperate with each other so that they are provided with a safe target front gap (TFG) and target lag gap (TLG) before they reach the merging point. CCM requires all AV to follow the G-CACC mode all the time. However, this study relaxes this CCM assumption of full cooperation. As the example in Fig 2 shows, the ego vehicle in the right lane is trying to merge into the left lane. The ego vehicle would follow the target front vehicle (TFV) in G-CACC mode and send a cut-in request to the target lag vehicle (TLV) in the meantime. If the TLV (also an AV) is courteous/cooperative enough to yield to the ego vehicle, it would approve the cut-in request and follow the ego vehicle in G-CACC mode; otherwise, the TLV will not follow the G-CACC mode. When the TLG and TFG are sufficiently large (and other lane-changing criteria are met), the ego vehicle will change lanes. More details on the modified CCM strategy are given in the methodology section.

Behavior model
In this study, AV follow the behavior model outlined in Fig 3. The key component of the AV behavior model is courtesy evaluation, which is indicated by the yellow box in Fig 3. When an AV receives cut-in (e.g., lane change) requests from other vehicles, it will evaluate the current situation and decide whether to yield to others or not. More details about courtesy evaluation are given later in Section 3.2 Courtesy and Courtesy Strategies.

PLOS ONE
Fig 3 has three main types of stages: decision, process, and AV action. At every simulation step, an AV begins with the "start" stage in Fig 3. After going through various subsequent stages, each AV will eventually arrive at at least one AV action stage and execute the corresponding car-following and/or lane-changing actions. There are several important constraints on CAV acceleration. These constraints are detailed in the following four subsections.
3.1.1 Car-following (CF). AV longitudinal acceleration is denoted as $Acc_{CF}$ and is constrained by the following factors (CF constraint). If there are no vehicles ahead of an AV within 100 m, the AV will drive in free mode. The vehicle speed oscillates around an expected speed of 33 m/s. An acceleration of 2 m/s² (deceleration of -0.5 m/s²) would be adopted if the speed of the AV is less (greater) than the expected speed. Otherwise, the AV will drive in CACC mode and follow the formula with OBD = 1 in Eq (1).
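The mode selection described above can be sketched as a small function. This is an illustrative sketch rather than the study's code: the function and parameter names are hypothetical, the CACC value is passed in as an input, and returning zero acceleration when the speed exactly equals the expected speed is an assumption the text does not specify.

```python
def car_following_acceleration(speed, gap_to_leader, cacc_accel,
                               detection_range=100.0, expected_speed=33.0,
                               free_accel=2.0, free_decel=-0.5):
    """Return Acc_CF in m/s^2: free-mode value if no leader within range, else CACC.

    gap_to_leader is None when no vehicle is detected ahead.
    """
    if gap_to_leader is None or gap_to_leader > detection_range:
        if speed < expected_speed:
            return free_accel
        if speed > expected_speed:
            return free_decel
        return 0.0  # assumption: hold speed when exactly at the expected speed
    return cacc_accel
```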
The AV would drive in the CF mode (either free or CACC) if it (1) does not have lane-changing motivation, (2) has not received any cut-in request, or (3) has received a cut-in request but decided not to yield. Under the CF mode, the G-lead vehicle in Eq (1) is simply the front vehicle (FV) in the same lane. Note that the CF constraint should also be enforced in some other AV action stages in Fig 3. For example, when an AV changes lanes, it still needs to consider the CF constraint to avoid colliding with its FV.
3.1.2 Mandatory lane change (MLC). As shown in Fig 4, an MLC involves four participants: the subject vehicle (SV), front vehicle (FV), target front vehicle (TFV), and target lag vehicle (TLV). $Acc_{CF}$ is also used as a constraint here (i.e., the acceleration of the SV should be equal to or smaller than $Acc_{CF}$) so that the SV would not collide with its FV. Besides on-ramps and lane reduction points, MLC can also happen before off-ramps when vehicles have to change lanes in order to exit a highway.
For HDV, a vehicle's lane-changing motivation typically increases as it approaches the merging point. In this study, AV follow "the earlier the better" principle and try to change lanes right after they enter the lane-change (LC) feasible region (e.g., the acceleration lane for vehicles merging from an on-ramp). However, successful lane changes are also subject to constraints such as sufficient TLG and TFG. Since an AV has to fully stop if it reaches the merging point without having been able to change lanes, a deceleration constraint $Acc_{MP}$ (MP stands for merging point) is introduced to make this happen. This deceleration is implemented by introducing a hypothetical stopped vehicle at the merging point in the closed lane, and the subject AV follows the CACC mode when it reacts to this "stopped vehicle". The deceleration constraint $Acc_{MP}$ is described in Eq (2), and the only required input is the SV's distance to the merging point (i.e., to the "stopped vehicle").
In Eq (2), $\ddot{x}_1(t)$ and $\dot{x}_1(t)$ are the acceleration and speed of the hypothetical stopped vehicle, respectively, and they are both set to 0. $r(t)$ is the distance from the SV to the stopped vehicle. All other parameters are defined in the same way as in the original CACC formula, Eq (1).
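A minimal sketch of the merging-point constraint follows, assuming the same linear CACC form as Eq (1) with the leader's acceleration and speed set to zero. The gains, jam distance, and time gap are hypothetical placeholder values, and the function name is illustrative.

```python
def merging_point_deceleration(v_sv, dist_to_mp, k1=0.5, k2=0.1, eta=2.0, tau_e=1.0):
    """Acc_MP sketch: CACC response to a virtual stopped vehicle at the merging point.

    With a leader at zero speed and zero acceleration, the assumed form reduces to
        k1 * (0 - v_sv) + k2 * (dist_to_mp - eta - tau_e * v_sv)
    """
    return k1 * (0.0 - v_sv) + k2 * (dist_to_mp - eta - tau_e * v_sv)


far = merging_point_deceleration(v_sv=10.0, dist_to_mp=500.0)   # large and positive
near = merging_point_deceleration(v_sv=10.0, dist_to_mp=20.0)   # negative: braking
```

Far from the merging point the value is large and positive, so it does not bind when the minimum over all acceleration constraints is taken; it only takes effect as the SV closes in on the virtual stopped vehicle.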
To create enough space in the target lane for the SV to merge into, the SV first executes CCM (with $Acc_{CCM}$, as indicated in Eq (1)) to generate a safe gap to the TFV. At the same time, the SV performs a feasibility check to see if the TLV is ready for the SV to merge into its lane. This check is done by calculating the forced deceleration of the TLV assuming the SV is already in the target lane. If the calculated deceleration is larger than the maximum deceleration allowed by the TLV, the feasibility check is considered failed. The forced deceleration is calculated following the G-CACC formula (OBD = 1) in Eq (1), except that the following vehicle is the TLV and the G-lead vehicle is the SV.
The feasibility check makes sure the intended lane change will not cause trouble for the TLV. If the feasibility check is passed, the SV further executes a gap acceptance check, just as human drivers do, to ensure safety from the SV's perspective. The difference is that the SV adopts smaller gaps than human drivers. The gap acceptance check is based on a fixed threshold rather than a probability: a gap is accepted if it is larger than a critical time gap threshold, and both front and lag gaps need to be larger than the threshold. The front gap can be created by the CCM strategy. However, sometimes the lag gap (because of the TLV) is not large enough for the SV to change lanes. In this case, the SV sends an LC request to the TLV to create a larger lag gap. Upon receiving the request, the TLV performs a courtesy evaluation and decides what to do next. More details on the courtesy evaluation are discussed later in this paper.
To conclude, the acceleration required for the SV due to MLC is governed by three constraints: $Acc_{MP}$, $Acc_{CCM}$, and $Acc_{CF}$. The minimum of them is chosen. The SV's lane-changing maneuvers are subject to the feasibility check and the gap acceptance check.
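The MLC logic summarized above can be sketched as two small functions. The names and the sign convention (decelerations passed as positive magnitudes) are assumptions made for illustration; the three accelerations are taken as already computed.

```python
def mlc_acceleration(acc_mp, acc_ccm, acc_cf):
    """The SV applies the most restrictive (smallest) of the three constraints."""
    return min(acc_mp, acc_ccm, acc_cf)


def can_change_lane(forced_decel_tlv, max_decel_tlv,
                    front_gap_s, lag_gap_s, critical_gap_s):
    """Feasibility check followed by the threshold-based gap acceptance check.

    Decelerations are positive magnitudes (m/s^2); gaps are time gaps (s).
    """
    feasible = forced_decel_tlv <= max_decel_tlv
    gaps_ok = front_gap_s > critical_gap_s and lag_gap_s > critical_gap_s
    return feasible and gaps_ok
```

When `can_change_lane` fails only because of the lag gap, the SV would send an LC request to the TLV, which triggers the courtesy evaluation.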

3.1.3 Discretionary lane change (DLC). The motivation of DLC is different from that of MLC: DLC is for pursuing higher speed or reducing travel time. In this study, AV DLC is modeled using utility, and an AV chooses lanes based on their utilities. The utility of target lane $i$ is defined in terms of $v_t$, the speed of the target vehicle, and $v_{lane_i}$, the average speed of target lane $i$. If the SV is currently in target lane $i$, the target vehicle is the SV's FV. If the SV is not in target lane $i$, the target vehicle is the TFV. A utility difference threshold is defined so that the SV would not change lanes too often. In other words, DLC itself has an initial cost. To overcome this initial cost, the utility gain from changing lanes has to be greater than a threshold of 5 m/s. Once the utility of each lane has been determined, the SV's DLC decision can be made. What happens next for the SV is not much different from MLC. The SV can still send requests via vehicle-to-vehicle communications for a larger TLG. The only difference is that $Acc_{MP}$ would not be considered as a constraint.
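The threshold rule for DLC can be illustrated with the sketch below. How each lane's utility is computed from the target-vehicle speed and lane average speed is left to the caller, because the utility formula is not reproduced here; the function name and the dictionary layout are hypothetical.

```python
def choose_lane(current_lane, lane_utilities, threshold=5.0):
    """Pick a lane for a DLC, or stay if no lane beats the current one by the threshold.

    lane_utilities maps lane id -> utility in m/s (computed elsewhere).
    """
    best_lane = max(lane_utilities, key=lane_utilities.get)
    gain = lane_utilities[best_lane] - lane_utilities[current_lane]
    return best_lane if gain > threshold else current_lane
```

For instance, with utilities of 20 m/s for the current lane and 27 m/s for the adjacent lane, the 7 m/s gain exceeds the 5 m/s threshold and the adjacent lane is chosen; a 4 m/s gain would leave the SV in its lane.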
3.1.4 Courtesy acceleration. MLC and DLC are control logics from the perspective of the SV. In this subsection, courtesy acceleration ($Acc_{courtesy}$) is introduced to model the behavior of the TLV. If a TLV decides to yield to another lane-changing vehicle, $Acc_{courtesy}$ is the corresponding acceleration needed. Again, $Acc_{courtesy}$ is determined following the CCM framework in Eq (1), in which the following vehicle is the vehicle behaving courteously (i.e., the TLV) and the G-lead vehicle is the one sending the cut-in request.

Courtesy and courtesy strategies
Some important concepts need to be introduced before going forward. Courtesy may mean a number of things in social interactions, and might be characterized as either a behavior (e.g., opening a door for another is the courteous thing to do) or a disposition (i.e., one has the character trait of being courteous). Presumably, AV have no dispositions, so we treat courtesy in the context of MLC/DLC as cooperating with another vehicle on the road when a lane change is necessary as part of the flow of traffic. From here we can define three important terms for our model: courtesy level, courtesy proxy, and courtesy principles. Courtesy level is a pre-determined attribute of an AV, and it measures the AV's level of willingness to cooperate. Courtesy proxy is an environment-dependent variable, and it is utilized by an AV to decide whether to cooperate or not. Courtesy principles (or courtesy strategies) are a group of principles and/or strategies taken by AV to achieve various goals. Fig 3 shows how AV in this study would decide whether to cooperate (i.e., show courtesy; in the case of MLC/DLC, yield) or not when receiving lane-changing (LC) requests. In previous studies, this decision process is implicitly considered in the AV behavior model [8,9], whose disadvantages are discussed in the previous section. Here, courtesy is explicitly modeled by
1. following how humans behave: AV with high courtesy levels would be more likely to yield in a given scenario, and different AV can have different courtesy levels that determine their cooperation.
2. pursuing certain systematic goals.
In other words, courtesy can be important for its own sake (i.e., as a principle of a good road system) or can be instrumental to some other good (e.g., to minimize congestion).
This paper provides a set of courtesy principles to guide courteous interactions among AV, using speed difference as a courtesy proxy. In future work this could be replaced by other proxies such as time to crash (as a proxy for safety) or required maximum deceleration rate (for passenger experience and safety). Non-instrumental principles have been considered elsewhere, albeit not for courtesy, but we set them aside here for simplicity's sake [21]. The proposed courtesy strategies are categorized into two groups: non-instrumental (Egoism and Altruism) and instrumental (Local Utilitarianism, Local Maximin, and Local Egalitarianism), and they are detailed in the rest of this section. The strategies chosen are popular in autonomous vehicle research that focuses on ethical decisions made by these vehicles. Utilitarianism is one of, if not the, most commonly considered moral systems for autonomous vehicle decision-making [22], while there has been recent attention to maximin as a proxy for popular social contract theories of ethics [23,24]. Egalitarianism is less studied but is seen as a proxy for a particular kind of fairness in vehicle interactions in which the parameter in question is equalized, or the benefit distributed, between parties [25].

Egoism.
Under Egoism, the target lag vehicle (TLV) yields only when it would be in its own interest to do so; here, when it would not require the TLV to sacrifice too much of its own velocity. In this study, we use a raw courtesy level (RCL) to describe the maximum velocity loss the TLV will incur in the name of cooperating with other cars. For example, a TLV with an RCL of 5 m/s would yield in a scenario that requires it to decelerate by 3 m/s, since 3 m/s is less than the prespecified 5 m/s. The courtesy proxy (CP) used in Egoism is thus defined as the speed difference of the TLV before and after a lane change.
Courteous behavior of the TLV would be triggered if $CP_{Ego}$ is less than the raw courtesy level of the TLV, as shown in Eq (5). Otherwise, the TLV will ignore the cut-in request and choose not to cooperate.

Raw courtesy level can vary for each AV, representing its individualized willingness to cooperate. As shown in Eq (6), it is derived from a standard courtesy level which ranges from 0 to 1.
Here, 0 means the AV will never yield to others, and 1 means that the AV will always cooperate if there is a request. $v_s$ is the maximum speed of the subject vehicle (which usually is the speed limit on the road).

Altruism.
Under Altruism, the TLV makes cooperation decisions from the perspective of the vehicle requesting courtesy (i.e., sending the cut-in request). In this case, the TLV would yield if the utility gained by the SV requesting courtesy is less than what the TLV's courtesy level allows. For example, a TLV with an RCL of 5 m/s would not yield in a scenario where the SV could speed up from 5 m/s to 11 m/s after a lane change, since the speed gain is greater than the prespecified courtesy level (11 m/s - 5 m/s > 5 m/s).
The courtesy proxy for Altruism is defined in Eq (7), which is the speed difference of the SV (i.e., the one sending the cut-in request) before and after a lane change. As in Egoism, courteous behavior of the TLV would be triggered if the courtesy proxy is less than the TLV's RCL as in Eq (5). The RCL of each AV under Altruism is also derived from a standard courtesy level as in Eq (6).
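The two non-instrumental rules can be sketched together, since they share the same threshold test and differ only in the proxy. Two points below are assumptions rather than statements from the paper: the raw courtesy level is taken as the standard level multiplied by the maximum speed, and each proxy is signed so that a positive value means a speed loss for the TLV (Egoism) or a speed gain for the SV (Altruism). All function names are illustrative.

```python
def raw_courtesy_level(standard_level, v_max):
    """Assumed form of Eq (6): scale a standard level in [0, 1] by the maximum speed."""
    return standard_level * v_max


def yields(courtesy_proxy, rcl):
    """Eq (5) as described in the text: yield when the proxy is below the RCL."""
    return courtesy_proxy < rcl


def cp_egoism(v_tlv_before, v_tlv_after):
    """Speed the TLV would lose by yielding."""
    return v_tlv_before - v_tlv_after


def cp_altruism(v_sv_before, v_sv_after):
    """Speed the requesting SV would gain from the lane change."""
    return v_sv_after - v_sv_before


# Worked examples from the text, both with an RCL of 5 m/s.
egoism_yields = yields(cp_egoism(15.0, 12.0), 5.0)      # loses 3 m/s
altruism_yields = yields(cp_altruism(5.0, 11.0), 5.0)   # SV gains 6 m/s
```

With the assumed scaling, a standard level of 0 gives an RCL of 0 so that no non-negative proxy passes the test, which matches the description that such an AV never yields.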

Local Utilitarianism (LU).
Local Utilitarianism strategy accounts for the utility gain of the SV attempting to change lane and the sacrifice (i.e., utility loss) to be made by the TLV. The TLV would yield to the SV if the sum of the utility resulting from a change is positive, accounting for the change in utility in both vehicles. It is called local utilitarianism because only two vehicles are involved and the immediate impacts (speed changes in this study) are considered in the decision process. However, the local decision may also benefit the operations of the entire traffic system. While considering more vehicles and the broader impacts (e.g., missing a downstream green light) of local cooperative behavior is desirable, it is very difficult to quantify and capture such effects given the complex relationship between individual vehicle behaviors and traffic system dynamics.
As an example of Local Utilitarianism (LU), a TLV with a courtesy level of 5 m/s would yield in a scenario where the SV could accelerate from 1 m/s to 9 m/s after changing lanes and the TLV would have to decelerate from 15 m/s to 9 m/s, since the sum of utility changes is positive (i.e., (9 - 1) + (9 - 15) = 2 > 0). The cooperative decision rules for the TLV under LU are further defined in Eq (9).
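The LU example above can be checked with a short sketch. The proxy is written here as the sum of the two vehicles' speed changes and the TLV is taken to yield when it is positive, as the text describes; the function name is hypothetical.

```python
def cp_local_utilitarianism(v_sv_before, v_sv_after, v_tlv_before, v_tlv_after):
    """Sum of the speed changes of the SV and the TLV (m/s)."""
    return (v_sv_after - v_sv_before) + (v_tlv_after - v_tlv_before)


cp_lu = cp_local_utilitarianism(1.0, 9.0, 15.0, 9.0)  # (9 - 1) + (9 - 15)
tlv_yields = cp_lu > 0
```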

Local Maximin (LM).
The idea behind Local Maximin (LM) is that the worst-off vehicle (here, the one with the lower speed) should be made better off. In other words, LM aims to maximize the lower bound of the interaction and, like LU, only considers the two vehicles in question. Based on this strategy, if the speeds of the TLV before and after the lane change are 15 m/s and 7 m/s, respectively, and the corresponding speeds of the SV are 3 m/s and 8 m/s, a courteous yield would be triggered, since the minimum speed is improved from 3 m/s (i.e., Min(15, 3)) to 7 m/s (i.e., Min(7, 8)).
The LM decision rules for the TLV are the same as those defined in Eq (9) except that $CP_{LM}$, defined in Eq (10), is adopted instead of $CP_{LU}$.
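A sketch of the maximin test follows, assuming the proxy is the change in the lower of the two speeds and that the TLV yields when this change is positive. This form is inferred from the worked example rather than quoted from Eq (10).

```python
def cp_local_maximin(v_sv_before, v_sv_after, v_tlv_before, v_tlv_after):
    """Change in the lower of the two vehicle speeds (assumed form of the LM proxy)."""
    return min(v_sv_after, v_tlv_after) - min(v_sv_before, v_tlv_before)


cp_lm = cp_local_maximin(3.0, 8.0, 15.0, 7.0)  # min(8, 7) - min(3, 15)
```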

Local Egalitarianism.
Egalitarianism is related to LM: courteous behavior requires that the deviations of vehicle speeds from the real-time global average speed be reduced. The courtesy proxy for Egalitarianism is defined in Eq (11).
In Eq (11), $v_G$ is the real-time global average speed, which is the average speed of all vehicles in the network. The decision rules defined in Eq (9) still apply for this Egalitarianism courtesy strategy, but with $CP_{Ega}$. Consider a scenario where the global average speed is 15 m/s, the speeds of the TLV before and after the lane change are 16 m/s and 12 m/s, respectively, and the corresponding speeds of the SV are 5 m/s and 11 m/s. Since $(16-15)^2 + (5-15)^2 > (12-15)^2 + (11-15)^2$, a courteous behavior is triggered. The Egalitarianism strategy does not consider the courteous behavior's impact on vehicles other than the TLV and SV, including the potential change in $v_G$.

Simulation setup
To evaluate various courtesy strategies, a corridor comprising four critical segments is designed as in Fig 5. The four segments have different levels of traffic demand and require different types of lane-changing maneuvers. Two adjacent segments are separated by a 500-meter straight roadway segment without on/off-ramps. The various types of on- and off-ramps generate many opportunities for vehicles to behave courteously and to test the proposed courtesy strategies. From left to right in Fig 5, the four segments are a Type A weaving area, a Type B weaving area, a work zone, and a Type C weaving area, respectively. The layouts of these weaving areas are taken from the Highway Capacity Manual (HCM) [26].
To test the proposed courtesy strategies' performance under different levels of traffic demand, three levels of traffic inputs (light, moderate, and heavy) are adopted, corresponding to different levels of service (LOS) described in the HCM [26]. Table 2 shows some sample service traffic volumes and the corresponding LOS taken from the HCM. In this study, light traffic represents LOS A, moderate traffic represents LOS C, and heavy traffic represents LOS E. LOS B and D are omitted for simplicity. If we use the traffic volumes directly from the HCM, upstream mainline traffic + on-ramp traffic - off-ramp traffic may not equal downstream mainline traffic, leading to imbalanced flows. To ensure flow conservation, minor adjustments (in the range of 25 to 125 veh/h/lane) are made to the traffic volumes corresponding to LOS A, C, and E in the HCM, as in Table 2. In Table 2, numbers outside parentheses are directly from the HCM and those inside parentheses are the final traffic volumes used in this study for light, moderate, and heavy traffic conditions. Note that only four volumes have been adjusted and are highlighted in bold font in Table 2. The same simulation parameters are used across different levels of traffic demand (e.g., free-flow speed: 120 km/h; volume ratio (weaving volume by non-weaving volume): 0.20).
The proposed AV behavior models as well as courtesy strategies are coded and simulated using SUMO [27] via the Traffic Control Interface (TraCI). Each simulation run is equivalent to one hour in the real world. The AV behaviors are updated at a frequency of 2 Hz, which means AV make decisions every 0.5 seconds. Such decisions include longitudinal control (accelerate or decelerate), lateral control (change lanes or not), and communications with each other (sending and receiving lane-changing courtesy requests). Each combination of courtesy strategy and traffic demand (light, moderate, and heavy) is simulated 10 times with different random seeds. The 10 simulation runs for each scenario allow us to calculate the means and standard deviations of performance metrics, which can be used to characterize how stable and reliable the simulation results are. In Figs 7-11, the shaded areas are computed from these means and standard deviations and show the 95% confidence intervals for each strategy at each courtesy level.
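One way to turn the 10 runs of a scenario into a mean and a 95% confidence band is sketched below. The use of a Student t critical value for 9 degrees of freedom (2.262) is an assumption; the text does not state how the intervals were computed, and the function name is hypothetical.

```python
import math
import statistics


def mean_and_ci95(values, critical_value=2.262):
    """Mean and 95% confidence half-width across simulation runs.

    critical_value defaults to the two-sided Student t value for 9 degrees of
    freedom (10 runs); this choice is an assumption for illustration.
    """
    mean = statistics.mean(values)
    half_width = critical_value * statistics.stdev(values) / math.sqrt(len(values))
    return mean, half_width


# Hypothetical average speeds (m/s) from 10 seeds of one scenario.
mean_speed, half_width = mean_and_ci95([19.0, 21.0] * 5)
```

The shaded band for a scenario would then span `mean_speed - half_width` to `mean_speed + half_width`.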
For non-instrumental strategies, courtesy level can have significant impacts on the results. Therefore, the same strategy with different courtesy levels is regarded as a set of distinct strategies during the simulation. Table 3 summarizes the strategies simulated/evaluated in this study. For non-instrumental strategies (i.e., Egoism and Altruism), a courtesy level needs to be specified and all AV apply the same courtesy level throughout the simulation. Various courtesy levels have been tested. The test results suggest that the network performance is more sensitive to low courtesy levels. Therefore, an incremental interval of 0.02 is used when the courtesy level is between 0 and 0.2, while an interval of 0.1 is used for courtesy levels between 0.2 and 1.
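The uneven grid of uniform courtesy levels can be generated as in the sketch below. Including both endpoints (0 and 1) and listing 0.2 only once are assumptions about how the grid was built.

```python
def courtesy_levels():
    """Courtesy levels: 0.02 steps on [0, 0.2], then 0.1 steps up to 1."""
    fine = [round(0.02 * i, 2) for i in range(11)]      # 0.00, 0.02, ..., 0.20
    coarse = [round(0.1 * i, 1) for i in range(3, 11)]  # 0.3, 0.4, ..., 1.0
    return fine + coarse


levels = courtesy_levels()
```

Rounding avoids floating-point drift such as 0.06000000000000001, which would otherwise make level values awkward to use as scenario labels.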
Heterogeneous courtesy is also considered. In this case, AV courtesy levels could follow certain distributions. We opt to use two distributions based on a survey of human preferences for altruistic or egoistic AV behavior conducted at the Massachusetts Institute of Technology (MIT) [22]. In the survey scenario, an AV has to either kill its passenger to save ten pedestrians or kill the ten pedestrians to save the passenger. The choices provided range from 0 ("protect passenger at all costs") to 1 ("minimize pedestrian casualties on the road"). The 182 survey respondents were asked this question in two distinct ways: (1) what will AV do? and (2) what should AV do? These two ways resulted in the two distributions shown in Fig 6. Considering the connection between the MIT survey and this study (i.e., should an AV be concerned more about its passenger or other people), the two distributions are also adopted in this research to model AV courtesy level: Courtesy Distribution Expected (CDE) for "what will AV do?" and Courtesy Distribution Moral (CDM) for "what should AV do?"

Evaluation
The evaluation of courtesy strategies has been conducted at both network and individual segment levels in terms of mobility, safety, and fairness. At each time step, a vehicle could be in [...] where $v_i$ is the global average speed for the $i$th kind of state, $v_j^t$ is the speed of vehicle $j$ at time step $t$, and $r_{it}^j = 1$ indicates vehicle $j$ is in state $i$ at time step $t$ ($r_{it}^j = 0$ otherwise). $TS$ is the total number of states for all vehicles, which is the sum of the numbers of time steps experienced by all vehicles.
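The state-based averaging described above can be sketched as follows. The record layout (one tuple per vehicle per time step) and the state labels are hypothetical; the per-state average divides the summed speeds in a state by the number of records in that state, and the overall average divides by the total number of records.

```python
from collections import defaultdict


def average_speed_by_state(records):
    """records: iterable of (vehicle_id, time_step, state, speed) tuples.

    Returns (per-state average speed, overall average over all state records).
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for _vehicle, _step, state, speed in records:
        totals[state] += speed
        counts[state] += 1
    per_state = {s: totals[s] / counts[s] for s in totals}
    overall = sum(totals.values()) / sum(counts.values())
    return per_state, overall


sample = [("a", 0, "LC", 10.0), ("a", 1, "other", 20.0),
          ("b", 0, "other", 30.0), ("b", 1, "courteous", 12.0)]
per_state, overall = average_speed_by_state(sample)
```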
Safety is measured by Deceleration Rate to Avoid a Collision (DRAC). In this study, only the DRAC of the TLV is considered when the SV is merging into the target lane. The DRAC for the TLV is calculated from $v_{SV}$ and $v_{TLV}$, the speeds of the SV and TLV (when the SV is changing lanes), and $s$, the corresponding space headway between the SV and TLV. $DRAC_{TLV}$ measures how risky a specific lane change is. A global DRAC is also calculated as the average of all $DRAC_{TLV}$ values observed during the entire simulation.
Fairness is also an important measure when courtesy and cooperation are considered [28,29]. This study adopts the Gini coefficient as a metric to measure system fairness. Two Gini coefficients are proposed: the global Gini coefficient (G_GC) and the categorical Gini coefficient (C_GC).
For G_GC, $n$ is the total number of state speeds for all vehicles (if a vehicle stays in the network for 10 seconds, there will be 10 state speeds for it), $v_i$ and $v_j$ can be any state speeds of all vehicles, and $\bar{v}$ is the average state speed. G_GC essentially measures how imbalanced the distribution of all state speeds is. The more imbalanced it is, the larger G_GC becomes. C_GC, on the other hand, measures how imbalanced the mobility of a particular state is.
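As a sketch of the fairness metric, the function below uses the standard mean-absolute-difference form of the Gini coefficient, which matches the symbols defined above ($n$, $v_i$, $v_j$, and the average speed). Whether the paper uses exactly this normalization is an assumption. The same function could be applied to all state speeds (for G_GC) or to the three state averages (for C_GC).

```python
def gini(values):
    """Gini coefficient: sum over all pairs |vi - vj| / (2 * n^2 * mean)."""
    n = len(values)
    mean = sum(values) / n
    total_diff = sum(abs(a - b) for a in values for b in values)
    return total_diff / (2 * n * n * mean)


equal_speeds = gini([10.0, 10.0, 10.0])  # perfectly balanced
two_extremes = gini([0.0, 10.0])         # one stopped, one moving
```

The double loop is quadratic in the number of state speeds, so a full-network computation over an hour of simulation would in practice need a sorted, linear-time formulation.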
For C_GC, $V_k$ ($k$ = 1, 2, 3) is the average speed for a state (i.e., courteous state, LC state, and other state), and $\bar{V}$ is the average of $V_1$, $V_2$, and $V_3$.

Mobility
For Egoism and Altruism, courtesy level is modeled by two approaches. The first approach assumes all AV have the same courtesy level, which varies between 0 and 1 and leads to different simulation scenarios (see the left halves of each subplot in Fig 7). The second approach considers the two distributions generated by the MIT survey [22], and in each simulation run different AV can have different courtesy levels based on the CDE and CDM distributions. The results from the second approach are presented in the right halves of each subplot in Fig 7. For Egoism and Altruism, the average speeds for all states in general increase monotonically as the courtesy level goes up, regardless of traffic volume; that is, as their tendency to cooperate increases according to Eqs (4), (5), and (7). This trend also holds for two component states, LC and other, except for some minor local fluctuations.
For the courteous state, the impact of courtesy level on average speed also depends on traffic volume. Under low traffic volume, average speed decreases as courtesy level increases. This trend is reversed, however, under high volume. This suggests that the system benefits more from courteous behaviors when the traffic becomes more congested. When Egoism and Altruism are compared at the same courtesy level, Egoism results in better overall (for all states) mobility performance under moderate and heavy traffic. Although Egoism performs slightly worse than Altruism under light traffic, the differences are much smaller than those under moderate and heavy traffic. After decomposing all states into the three component states, the superiority of Egoism over Altruism is found to come primarily from its advantage in the LC state and other state.
For non-uniform courtesy, the mean and standard deviation of the courtesy distribution from the MIT survey are 0.58 and 0.35 for CDE, and 0.75 and 0.30 for CDM. The system performance under CDE and CDM (see the right halves of each subplot in Fig 7) is compared to that with uniform courtesy levels set to 0.58 and 0.75, respectively. It is found that the global average speeds of uniform courtesy for the LC and courteous states are consistently higher than or at least equal to those of non-uniform courtesy. This implies that for egoistic and altruistic courtesy strategies, a system with non-uniform courtesy may bring instability and thus cause system inefficiency. A closer look at the results of individual simulation runs for CDE and CDM suggests that some of them (about 1 or 2 out of 10) have surprisingly low average speeds (under 5 m/s) for the LC state, and there is no courteous state at all (meaning no vehicles yielded). This is the same as what happens when the courtesy level is zero under uniform courtesy and no vehicles cooperate. A possible explanation is that although CDE and CDM have relatively high average courtesy levels, AV with very low courtesy levels (on the left tail of the courtesy distribution) may sometimes cause a system failure, in which the courtesy proxy becomes very high (for both Egoism and Altruism) compared to the RCL because SV attempting to change lanes queue up at the end of the segment. Under system failure, even AV with high courtesy levels would find it very costly or impossible to yield.
The mobility performance for CDM is better overall than that for CDE. This may be because the courtesy distribution of CDM is more skewed towards 1 than that of CDE (see Fig 6). Although under heavy traffic Egoism with CDM has a lower average speed for the LC state than Egoism with CDE, this could be attributed to the randomness of simulation, as more system failures occur for Egoism with CDM (which is also hinted at by the larger standard deviation of the simulation results).
Local Utilitarianism achieves the highest average mobility. This is an unsurprising result given that the courtesy strategy itself is encoded with a limited account of mobility maximization. At least in simple road networks we should expect that if individual interactions aim to maximize mobility, then the overall network will also maximize mobility.
Local Maximin and Egalitarianism perform about the same under light and moderate traffic. It is worth noting that Egalitarianism degrades sharply under heavy traffic. This is because, in this case, the cooperative behavior of AV is guided by the real-time global average speed (as in Fig 8). After changing lanes, the speed of SV is assumed to reach the speed of TLV, which is usually an increase. When the global speed is high, the system under Egalitarianism encourages TLV to behave courteously (see Eq (11)) because the resultant speed will be closer to the global speed (the speed of TLV is assumed to be unchanged after LC). However, under heavy traffic, when the global speed is typically lower, egalitarian principles might discourage TLV from yielding to other vehicles because a local speed increase after a lane change would increase the variation of the system's speed. Non-cooperation would further exacerbate the congestion, and the global speed would decrease even more, finally leading to a system failure.

Safety
DRAC is a safety measure that gives the minimum deceleration a TLV needs to apply to avoid colliding with a lane-changing SV. The smaller the DRAC, the safer the yielding maneuver. In this study, each time a TLV yields to an SV, the corresponding DRAC is recorded, and the average is shown in Fig 9. For the Egoism and Altruism strategies with uniform courtesy, as the courtesy level increases, the DRAC first goes up and peaks at a courtesy level of 0.04, then decreases and reaches its lowest value (less than 0.5 m/s²) at a relatively low courtesy level (about 0.1 for moderate traffic and 0.2 for heavy traffic). This pattern differs from that seen in the mobility results, where the system performance increases monotonically with courtesy level.
Altruism is found to be safer than Egoism when a small amount of uniform courtesy is introduced into the system, but it converges with Egoism at a courtesy level of about 0.1 under all three traffic inputs. The results again suggest that variance in the courtesy level (i.e., a non-zero standard deviation) can negatively affect network performance, especially under high traffic input. Among the instrumental courtesy strategies, Local Utilitarianism achieves the lowest and most stable (very small variance) DRACs.
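To make the safety measure concrete, the following is a minimal sketch of how a DRAC value could be computed for one yielding event and averaged over many events. The paper's exact DRAC formula is not reproduced in this section, so the constant-deceleration kinematic form used here is an assumption, and the names `drac` and `mean_drac` and the event tuple layout are hypothetical.

```python
def drac(v_follower: float, v_leader: float, gap: float) -> float:
    """Deceleration (m/s^2) needed to cancel the closing speed within `gap`.

    Assumed form (not taken from the paper): constant-deceleration
    kinematics, closing_speed**2 / (2 * gap).

    v_follower: speed of the yielding target-lane vehicle, TLV (m/s)
    v_leader:   speed of the lane-changing vehicle, SV, ahead of it (m/s)
    gap:        bumper-to-bumper distance between the two (m), must be > 0
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    closing_speed = v_follower - v_leader
    if closing_speed <= 0:
        return 0.0  # TLV is not closing in on SV, so no braking is required
    return closing_speed ** 2 / (2.0 * gap)


def mean_drac(events):
    """Average DRAC over (v_follower, v_leader, gap) yielding events."""
    values = [drac(vf, vl, g) for vf, vl, g in events]
    return sum(values) / len(values) if values else 0.0
```

For example, a TLV travelling 5 m/s faster than an SV that merges 25 m ahead would need 0.5 m/s² under this assumed form, which is the order of magnitude discussed for Fig 9.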

Fairness
An interesting pattern in the fairness evaluation results (Fig 10) is that a strategy with better mobility (for all states) often also has better fairness. Egalitarianism under heavy traffic yields the lowest fairness performance, which is the opposite of its design intention. This is probably because it only encourages courteous behavior whose resulting after-lane-change speed does not surpass the global average speed. When the global average speed is relatively low (usually under heavy traffic), cooperative behavior that could improve traffic operations and increase the global speed would not be selected by the Egalitarianism strategy, although the strategy works well under moderate and light traffic.

Vehicle state distribution
To better understand how each courtesy strategy affects system performance, the percentages of courteous state and lane-changing state generated by each strategy are computed and shown in Fig 11. Such percentages reflect how often each courtesy strategy takes effect. All courtesy strategies are evaluated using the same set of traffic demands (see Table 2), and for each OD matrix the required number of lane changes (i.e., to join/leave the highway) is the same for all strategies. Therefore, comparing the percentages of courteous and lane-changing states helps to understand how each courtesy strategy works, for example, how long vehicles spend in the lane-changing process. In a very courteous environment, the lane-changing state percentage (LCSP) should be lower than in an uncourteous environment where nobody yields, since vehicles spend less time waiting for safe gaps before changing lanes.
It is worth noting that a high courteous state percentage (CSP) does not necessarily result in a low LCSP. Given a very congested on-ramp or work zone, courteous vehicles may need to wait a long time for lane-changing vehicles to clear the bottleneck. In this case, both CSP and LCSP could be high. Fig 11 clearly shows that with uniform courtesy between 0 and 0.3, Egoism has lower LCSP and higher CSP than Altruism under moderate and heavy traffic. Recall that Egoism also has better mobility results than Altruism in Fig 7 under the same traffic and courtesy conditions. This suggests that more courteous behavior contributes to reducing the LCSP and improving system mobility performance. It also suggests the importance of carefully defining the courtesy proxy and courtesy level. Egoism defines the courtesy proxy from the perspective of TLV, while Altruism defines it based on SV. As the magnitude of the courtesy proxy (i.e., speed difference) differs between TLV and SV during a lane change (the speed gained by SV is usually larger than the speed lost by TLV), courteous behavior is more likely to occur under Egoism than under Altruism at the same uniform courtesy level (consistent with the higher CSP for Egoism in Fig 11). This explanation is further illustrated in Fig 12 and Eqs (4)-(7). For the same courtesy level, speed loss, and speed gain, Egoism (Fig 12A) will yield based on Eqs (4)-(6), while Altruism (Fig 12B) will not based on Eq (7).
Another interesting finding from Figs 7 and 11 is that for a system to work well, the CSP should be neither too low nor too high. Egalitarianism is one example. A low CSP indicates a lack of willingness to cooperate, which makes it difficult for SV (i.e., LC vehicles) to change lanes, while a very high CSP could be the result of a system failure.
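As an illustration of how such state percentages might be tallied from simulation output, the sketch below counts one state label per vehicle per time step and converts the counts to percentages. Treating the percentage as a share of vehicle-time-steps is an assumption, and the function name `state_percentages` and the state labels are hypothetical rather than taken from the paper or from SUMO.

```python
from collections import Counter


def state_percentages(state_log):
    """Share (%) of vehicle-time-steps spent in each state.

    state_log: iterable of state labels, one entry per vehicle per
    simulation step (e.g. "normal", "courteous", "lane_changing").
    The labels are illustrative, not the paper's identifiers.
    """
    counts = Counter(state_log)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing logged, so no percentages to report
    return {state: 100.0 * n / total for state, n in counts.items()}


# Usage: 10 logged vehicle-time-steps for one strategy
log = ["normal"] * 6 + ["courteous"] * 1 + ["lane_changing"] * 3
shares = state_percentages(log)
```

Comparing the "courteous" and "lane_changing" shares across strategies for the same demand then mirrors the comparison made with Fig 11.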

Performance of individual segments
In addition to analyzing the performance of the entire network, the mobility results for individual segments and lanes are also collected. Table 4 shows the lane-by-lane average speed results for each of the four important segments.
Under light traffic, the average speed in the rightmost lane of the type C weaving area is among the lowest under all instrumental courtesy strategies, indicating that this type C weaving area is the bottleneck of the corridor. For Egoism and Altruism with low uniform courtesy, it is hard for vehicles in the closed lane (i.e., rightmost lane) to merge in the work zone area, leading to low average speeds. Overall, the results suggest that Egoism and Altruism with a low level of uniform courtesy do not work well under light traffic, particularly for the type A and type C weaving areas and the work zone.
Similar trends are found for moderate and heavy traffic. For Egoism and Altruism with uniform courtesy, a higher courtesy level is critical to accommodate the increased demand for lane changes. Congestion with low average speed is mostly found in the rightmost lanes of the type A weaving area and work zone and in the middle lane of the type C weaving area. The results in Table 4 are consistent with the global average speed results in Fig 7. The low speeds may be caused by a few slow-moving vehicles seeking lane-changing opportunities in the absence of cooperative TLVs.
To further understand the contribution of each segment to the global mobility performance, the contribution of the i-th segment under each courtesy strategy, CT_i, is calculated using Eq (16), where v_g is the global average speed and v_−i is the average speed without considering vehicles in the i-th segment. The idea behind Eq (16) is to calculate the removal effect, i.e., the difference between the results with and without the i-th segment. A positive CT_i value means that having the i-th segment does not worsen the global performance, while a negative value suggests the opposite. The distributions of segment contribution are plotted in Fig 13. It is clear that the type C segment worsens the global average speed most of the time. The contribution ranking (from negative to positive) based on the mean CT_i is: type C, work zone, type A, and type B, and the contribution stability ranking (from highest to lowest) based on the variance of CT_i is: work zone, type A, type B, and type C.
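The removal-effect idea can be sketched as follows. Because Eq (16) is not reproduced in this section, the sketch assumes the simplest form consistent with the description, namely the global average speed minus the average speed computed without the i-th segment; any normalization in the original equation is not captured. The function names and the dictionary layout are hypothetical.

```python
def mean_speed(samples):
    """Arithmetic mean of a non-empty list of speed samples (m/s)."""
    return sum(samples) / len(samples)


def segment_contributions(speeds_by_segment):
    """Removal effect per segment: v_g minus v_without_segment.

    speeds_by_segment: dict mapping a segment name to the list of vehicle
    speed samples observed in it. At least two non-empty segments are
    needed, otherwise the "without" average is undefined.
    Assumed form, not the paper's Eq (16).
    """
    all_speeds = [v for speeds in speeds_by_segment.values() for v in speeds]
    v_g = mean_speed(all_speeds)  # global average over every segment
    contributions = {}
    for seg in speeds_by_segment:
        others = [
            v
            for name, speeds in speeds_by_segment.items()
            if name != seg
            for v in speeds
        ]
        v_minus_i = mean_speed(others)  # average with this segment removed
        contributions[seg] = v_g - v_minus_i
    return contributions


# Usage with two illustrative segments
ct = segment_contributions({"type_C": [10.0, 10.0], "type_B": [30.0, 30.0]})
```

In this toy case the slow segment receives a negative contribution and the fast one a positive contribution, matching the sign convention described for CT_i.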

Conclusions and discussion
Cooperation is a critical component of roadway traffic, particularly with the introduction of autonomous vehicles (AV). Previous studies often implicitly assume all AV to be fully cooperative, without considering the many possible strategies for implementing cooperation and the ethical and fairness issues involved. In this study, a systematic scheme to model and evaluate AV's cooperative behavior is proposed. A total of five courtesy strategies are proposed. A modified rule-based AV behavior model, CCM, is presented and used to evaluate the proposed courtesy strategies. SUMO simulation results at the network and individual segment levels suggest that: (1) vehicles adopting courteous/cooperative behaviors could also benefit themselves, particularly under heavy traffic; (2) for Egoism and Altruism, uniform courtesy is more effective than non-uniform courtesy with the same mean courtesy level; (3) variables such as traffic OD, courtesy level, and courtesy distribution (only applicable to non-instrumental strategies) play important roles in determining a system's global performance, and at the local level, segment type and lane position are additional parameters that affect traffic operations; (4) different performance metrics for the same courtesy strategy are correlated, for example, under moderate and heavy traffic Egoism has better mobility, worse safety, and better fairness performance than Altruism at the same level of uniform courtesy; and (5) Local Utilitarianism performs the best among all instrumental strategies in terms of mobility, safety, and fairness, and its performance is also very stable.
For future work, courtesy strategies following other principles can be explored. For example, an AV may yield with a probability, which can be either prespecified or dependent on real-time traffic conditions (e.g., how many times it has already yielded during that trip). AV may be able to earn credits for yielding to others and generating a net gain for the system. They can use such credits in the future to pay other vehicles to yield to them. Also, if multiple courtesy requests are received (although this does not happen often for lane changes on highways with two lanes in each direction), the request from the AV with the highest courtesy credit score would be served. Some of these ideas have already been explored in the context of intersection traffic management. For example, Dresner and Stone [30] proposed a First-Come-First-Serve (FCFS) strategy in their pioneering work on intersection traffic control with all autonomous vehicles. Carlino et al. [31] further extended the work by Dresner and Stone [30] and proposed an innovative auction-based autonomous intersection management strategy. Although there are important differences between lane change and intersection management, it would still be very interesting to modify the FCFS and auction-based methods for modeling courtesy in lane changes. Putting a price tag on courtesy may generate unwanted and complicated behavior if not properly handled. For example, a vehicle in the target lane may slow down on purpose to create a large/safe/attractive gap so that this gap can be sold at a high price. A possible but certainly not easy solution is for this vehicle to pay the system for the extra delay it has caused. Also, the lane-changing vehicle needs to determine the best gap to take given the costs and the potential impacts on surrounding vehicles. These are all interesting but challenging problems to investigate and are beyond the scope of this paper.
In this study, the equations for LU, LM, and local Egalitarianism are not derived from rigorous mathematics and are essentially heuristic. Developing the optimal equations is a very challenging task both mathematically and philosophically. We will continue to investigate this important area in future research.