Joint optimisation of transfer location and capacity for a capacitated multimodal transport network with elastic demand: a bi-level programming model and paradoxes

With the growing attention toward developing multimodal transport systems to enhance urban mobility, there is an increasing need to construct new infrastructure, or rebuild or expand existing infrastructure, to accommodate the current and newly generated travel demand. Therefore, this study develops a bi-level model that simultaneously determines the location and capacity of the transfer infrastructure to be built, considering elastic demand in a multimodal transport network. The upper-level problem is formulated as a mixed-integer linear programming problem, whereas the lower-level problem is a combined trip distribution/modal split/assignment model that depicts both the destination and route choices of travellers via a multinomial logit model. Numerical studies are conducted to demonstrate the occurrence of two Braess-like paradox phenomena in a multimodal transport network. The first states that, under fixed demand, constructing new parking spaces to enable park-and-ride services could deteriorate the system performance measured by the total passengers' travel time. The second reveals that, under elastic demand, increasing the parking capacity of park-and-ride services to promote their usage may fail, as reflected by a decline in their modal share. Meanwhile, a numerical example also suggests that constructing transfer infrastructure at distributed stations outperforms building one large transfer centre in terms of attracting travellers to sustainable transit modes.


Introduction
Promoting urban multimodal mobility and the use of public transport has attracted increasing attention from both industry and academia, as it is regarded as an effective strategy for mitigating severe urban congestion (Pineda et al., 2016; Zhao et al., 2019). One of the critical factors determining the efficiency and attractiveness of a multimodal transport network is whether travellers can seamlessly egress from one subnetwork and access another, which depends on the integration of the different transport modes. For integrated multimodal transport services, increasing attention has been paid to the development of Intelligent Technology (IT) solutions (Xu et al., 2021). From our perspective, the development of IT solutions should also be driven by advances in modelling methodologies and built upon an understanding of multimodal networks. Thus, mathematical models are still needed to help decision-makers gain insight into the properties of multimodal networks and thereby develop more effective IT solutions. Hence, this study is motivated to model and solve one of the fundamental problems in integrating multimodal transport services, namely, how to add new transfer links or expand transfer capacities, which is classified as a multimodal network design problem (Farahani et al., 2013). Furthermore, it is well known that a similar problem in a road transport network is associated with the Braess paradox, a counterintuitive phenomenon in which adding a new link or expanding the capacity of an existing link may not improve the network performance measured by the total travel cost. However, in the context of the multimodal network design problem, this paradoxical phenomenon has not been examined, although it is expected as a result of the discrepancy between the user equilibrium and the system optimum.
Thus, another motivation of this study is to demonstrate the characteristics of the paradox phenomenon in a multimodal transport network, as exploring the conditions under which the Braess paradox occurs could help avoid it and improve the system efficiency (Yao et al., 2019a).
With regard to the network design problem, existing studies can generally be classified into three categories according to their decision variables and how they model travellers' behaviour, that is, mode and route choices. In the first category, the decision variables are associated with, and exclusive to, a single transport mode (e.g., bus route structure, frequency, or road link capacity) (Szeto and Jiang, 2012; Yao et al., 2012; Yan et al., 2013; Tang et al., 2020). This allows a traffic/transit assignment model to be executed independently on each transport subnetwork with a symmetric link performance function, that is, the BPR function (e.g., Beltran et al., 2009; Chen et al., 2020a,b; Lee and Vuchic, 2005; Szeto et al., 2011, 2015; Wang et al., 2021b). This category overlooks the fact that the decision variables of one transport mode could affect the network topology of another transport mode, and that different transport modes could compete for the same limited infrastructure resources. For example, when buses and cars share the same road space, the construction or expansion of a bus lane reduces the capacity available to cars, which sometimes makes transit lanes ineffective. This issue is addressed by the second category, which designs the allocation of infrastructure resources in a multimodal context (e.g., Elshafei, 2006; Mesbah et al., 2008; Li and Ju, 2009; Fan and Machemehl, 2011; Yao et al., 2015; Yu et al., 2015; Huang et al., 2021). The above two categories assume that travellers' mode and route choices are confined to one specific transport mode. In other words, a traveller's trip contains only one travel mode¹, that is, car or public transport. This overlooks intermodal travel behaviour, in which a traveller utilises more than one transport mode to complete a trip, such as park-and-ride (P + R) and bike-and-ride (B + R).
In the last decade, with the development of multimodal mobility, intermodal travel has become a prevailing option for commuters. In view of this, a third category has emerged, in which the decision variables concern the attributes of transfer nodes in a multimodal transport network (Alumur et al., 2012; Liu and Meng, 2014; Huang et al., 2020; Chen et al., 2016). This category captures more realistic travel behaviour, in which intermodal trips are considered in the route choices. Nevertheless, to the best of our knowledge, studies within this category are limited.
Irrespective of the modelling categories, a common methodology for formulating the multimodal network design problem is bi-level programming. For the upper-level problem, most of the existing studies on multimodal transport network design focus on the location, capacity, or parking fee of the P + R facilities (Zhong et al., 2020; Chen et al., 2021; Liu et al., 2021), while the B + R facilities are ignored. It has been acknowledged that B + R facilities could provide access/egress services for commuters who travel between 2 and 5 km to a public transport stop (Martens, 2004). Such transfer infrastructure has the advantages of occupying less area and having less impact on the nearby network. However, no existing study examines the simultaneous determination of the location and capacity of multiple types of transfer infrastructure, including B + R facilities. Furthermore, most studies on P + R services report that setting up P + R facilities would shift commuters from the auto mode to the transit and P + R modes (Wang et al., 2004; Liu et al., 2009; Wang et al., 2014; Pineda et al., 2016). Nevertheless, these studies were conducted under a fixed total travel demand; the impacts under variable demand have not been examined. For the lower-level problem, a combined mode-split and traffic-assignment model is commonly employed. Abdulaal and LeBlanc (1979) proposed an equilibrium condition in which it is assumed that all travel modes are available to commuters when they plan their route. Meng and Liu (2012) adopted a similar condition to formulate a combined mode-split and traffic-assignment problem for designing a multimodal toll pattern in a bimodal network. Xu et al. (2018) considered the combined mode and route choice behaviour for estimating the multimodal network capacity in a bimodal network. Wang et al. (2020a) applied the combined mode-split and traffic-assignment equilibrium model to a real-size multimodal network with auto, bus, and metro subnetworks but ignored the transfer modes. All these studies combined mode split and traffic assignment with a fixed origin-destination (OD) demand pattern. However, when the transport infrastructure changes, the fixed OD demand assumption may not be valid, as more trips can be generated and distributed to different destinations. To model such a phenomenon, Oppenheim (1993) developed a logit-based function to capture the elastic travel demand, which was then extended by Yang et al. (2000), Ho et al. (2006), and Chu (2011). However, elastic demand of this kind has not been examined in a multimodal transport network design problem.
The Braess paradox has been examined in depth in both traffic and transit networks (e.g., Fisk, 1979; Hallefjord et al., 1994; Yang and Bell, 1998; Braess et al., 2005; Szeto et al., 2009; Zhao et al., 2014; Szeto and Jiang, 2014b; Bagloee et al., 2019). We refer to Nagurney and Nagurney (2021) for the latest summary of the Braess paradox. Our literature review identifies the following research gaps. First, most existing studies focus on a single travel mode. Only Fisk (1979) examined a paradox in a bimodal network, in which reducing auto flow triggers increased transit costs. Second, very few studies have examined the paradox phenomenon under variable demand. Although Hallefjord et al. (1994) and Bagloee et al. (2019) considered elastic demand, both rely on an explicit demand function to depict the change in demand as a function of travel cost. Third, it has been pointed out that the paradox can occur in a network with either a flow-dependent or a flow-independent link performance function (Wang and Szeto, 2017; Yao et al., 2019b). However, none of the existing studies demonstrates the occurrence of a paradox in a network with a mix of link performance functions. In view of the preceding findings, this study is motivated to illustrate the paradox phenomenon in a multimodal transport network under variable demand with different link performance functions.
¹ Walking is considered a means of accessing or egressing a mode rather than an independent mode for completing a trip.
To conclude, Table 1 provides an overview of the existing literature on bi-level multimodal network design problems and highlights the differences in our model. The contributions of this study are summarised as follows: 1) We developed a bi-level model for the multimodal network design problem that determines the transfer location and capacity simultaneously. As revealed in Table 1, these two decision variables have not been considered together in previous studies. Moreover, both P + R and B + R facilities can be provided in the multimodal transport network. 2) We devised a combined trip distribution/modal split/traffic assignment model as the lower-level problem, capturing the demand elasticity and passengers' route choice behaviour that result from the construction of the transfer infrastructure. As can be observed from Table 1, the lower-level models in previous studies overlook the trip-distribution problem. Meanwhile, most of these studies are built on the fixed demand assumption, except Szeto et al. (2015) and Jiang and Szeto (2015), who encapsulated a Lowry-type land-use and transportation interaction model to address the demand elasticity. 3) We conducted experiments to demonstrate the occurrence of a Braess-like paradox phenomenon in a multimodal transport network, in which providing P + R services may not benefit the system performance measured by the total travel cost, and examined the impacts of different parameters on the occurrence of the paradox.
The remainder of the paper is organised as follows: Section 2 introduces the network representation for the multimodal transport network, assumptions, and notations, and then presents the bi-level formulation. The solution method is described in Section 3. Section 4 presents the experiments to illustrate the occurrence of the Braess-like paradox in a multimodal transport network. Finally, Section 5 presents the conclusions and outlines future directions.

Problem description
We consider an urban transport system and adopt a supernetwork approach (Sheffi, 1985) to depict multimodal transportation networks. The supernetwork is denoted by $G = (N, A)$, where $N$ and $A$ represent the sets of nodes and links, respectively. It contains $|M|$ subnetworks, and each subnetwork is represented by $G_m = (N_m, A_m)$, which corresponds to travel mode $m$, that is, auto, metro, or bike. A commuter travelling between nodes $o$ and $d$ can travel via either a single mode in one subnetwork or an intermodal trip through multiple subnetworks, that is, P + R or B + R. Accordingly, we use sets $M$ and $\bar{M}$ to denote the travel modes that utilise one subnetwork and multiple subnetworks, respectively. For a travel mode $m \in \bar{M}$, there exists a set of potential transfer nodes where a traveller can transfer from one subnetwork to another. The link connecting two transfer nodes in different subnetworks is termed a transfer link. The set of all possible transfer links for mode $m \in \bar{M}$ is denoted by $\bar{A}_m$.
To facilitate the model development, the following assumptions are made.
A1) The link performance function on each subnetwork is independent (e.g., Mesbah et al., 2008; Szeto et al., 2015; Yao et al., 2015). This is reasonable for metro and road subnetworks. As for the car and bus subnetworks, the assumption is valid on the condition that buses are operated in exclusive bus lanes.
A2) In line with the prevailing network design literature (e.g., Beltran et al., 2009; Szeto et al., 2010, 2013), the link performance function is continuous and differentiable, and it is formulated as a function of the flow travelling on the link.
A3) The flow on auto subnetwork links refers to vehicle flow, while the flow on metro, bike, and bus links refers to passenger flow. The vehicular flow is transformed into passenger flow by the car occupancy rate (Lam and Small, 2001).
A4) Changes in the network affect residents' travel behaviours, including trip destination, travel mode, and routes. Meanwhile, it is assumed that commuters' mode and route choice behaviour follows the multinomial logit distribution, in line with the prevailing literature (e.g., Yang et al., 2000; Lam et al., 2008; Liu and Meng, 2014).
A5) For links on each subnetwork, a soft capacity constraint is imposed, meaning that the flow is allowed to exceed the capacity at an additional cost. For the transfer links, a hard capacity constraint is imposed, representing the limited transfer capacity, such as limited parking spaces in the P + R mode or limited docking slots in the B + R mode.
Based on the preceding assumptions, the multimodal network design problem considered in this study is as follows: 1) At the upper level, the decision-maker plans to construct infrastructure to enhance urban mobility, measured by the total number of trips generated, and to promote the usage of green travel modes, such as P + R and B + R. The corresponding decision variables comprise where to build the infrastructure and the corresponding capacities, that is, at which nodes in the network the parking spaces and bike docking slots or public bike share facilities should be constructed under the budget constraint. Mathematically, they are denoted as $\xi_a^m$ and $c_a^m$, respectively. $\xi_a^m$ is a binary variable that equals 1 if candidate link $a$ is selected for constructing the transfer infrastructure; otherwise, it equals 0. If transfer link $a$ is selected, the corresponding capacity provided is denoted as $c_a^m$, which is a number between the minimum and maximum capacities, that is, $C_a^{m,\min}$ and $C_a^{m,\max}$. Once the capacities associated with the transfer links are determined from the supernetwork shown in Fig. 1, the decision-maker can easily infer the transfer nodes in the different subnetworks.
As it is already known that providing P + R facilities can generate additional traffic (e.g., Parkhurst, 1995, 2000) and, similarly, that linking bike stations and metro stations can lead to increased trips (Noland et al., 2016), the objective of the upper-level model is, thus, devised to examine the maximum number of trips that can be generated, which is bounded by the maximum total travel demand in the area that could be predicted via big data. From such an objective, the decision-maker can infer the maximum number of land-use activities or zonal development potential (Yang et al., 2000).
2) The model at the lower level depicts the passengers' behaviour for the newly generated trips. This study simultaneously captures destination choice, travel mode choice, and route choice. Mathematically, we use $q_{rs}^{+}$ and $q_{rs}^{m+}$ to denote the number of newly generated trips between OD pair $rs$ and the number of those passengers using mode $m$, respectively. Meanwhile, we use $f_p^{m+}$ to represent the number of new passengers travelling via path $p$ of mode $m$. To differentiate between existing and newly generated passengers, we put "0" in the superscript for the former. For example, $f_p^{m0}$ denotes the number of existing passengers travelling via path $p$ of mode $m$, which can be either obtained from the assignment model or estimated using real data.

Notations
The following notations are used in the proposed bi-level model: The functions of the variables are explained when they are used.

Bi-level formulation
The bi-level optimisation model developed for the multimodal network design problem is presented in the following two subsections.

Upper-Level problem
The objective of the upper-level problem is to maximise the total number of trips generated.
where $o$, $\xi$, and $c$ are the vector notations for the corresponding decision variables. Eq. (2) is the budget constraint, where $G(\xi)$ is the cost function of constructing infrastructure and is generally assumed to be non-negative, increasing, and differentiable with respect to capacity (Yang and Bell, 1998). Eqs. (3) and (4), respectively, set the upper and lower bounds for the number of trips generated at each origin and destination node. $q_{rs}^{m+}(o)$ denotes the newly generated trips travelling via mode $m$ between nodes $r$ and $s$. It is determined at the lower level once the total number of new trips is specified. Therefore, it can be expressed as a function of $o$. Eq. (5) represents the capacity constraints associated with each piece of transfer infrastructure to be built. If a transfer location is selected, that is, $\xi_a^m = 1$, then the capacity to build should be within $[C_a^{m,\min}, C_a^{m,\max}]$; otherwise, $\xi_a^m = 0$ and $c_a^m = 0$. Eq. (6) is the capacity constraint at the transfer link, stating that the flow traversing link $a$, denoted by $v_a^m(o)$, should be no greater than the link capacity. This capacity constraint applies to the intermodal travel modes $m \in \bar{M}$ (those using multiple subnetworks), in which the transfer flow is strictly bounded by the infrastructure capacity, for example, the number of parking slots. Eq. (7) is the definitional constraint for the variable $\xi_a^m$. Finally, Eqs. (8) and (9) are the non-negativity constraints for variables $o_r^{+}$ and $d_s^{+}$, respectively.
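The constraints described above can be illustrated with a small feasibility check. The following is only a sketch under stated assumptions: the function names, the dictionary-based data layout, and the linear construction cost (`linear_cost`, with made-up fixed and per-slot costs) are hypothetical and are not taken from the paper, whose cost function is only required to be non-negative, increasing, and differentiable.

```python
def upper_level_feasible(xi, cap, flow, cap_min, cap_max, build_cost, budget, tol=1e-9):
    """Check location/capacity decisions against the upper-level constraints.

    xi, cap, flow, cap_min, cap_max are dicts keyed by candidate transfer link.
    build_cost is a callable returning the total construction cost.
    """
    for a in xi:
        # Binary location variable (cf. Eq. (7)).
        if xi[a] not in (0, 1):
            return False
        # Capacity lies in [min, max] if built and is zero otherwise (cf. Eq. (5)).
        if not (xi[a] * cap_min[a] - tol <= cap[a] <= xi[a] * cap_max[a] + tol):
            return False
        # Hard capacity on the transfer link (cf. Eq. (6)).
        if flow[a] > cap[a] + tol:
            return False
    # Budget constraint (cf. Eq. (2)).
    return build_cost(xi, cap) <= budget + tol


def linear_cost(xi, cap, fixed=1000.0, per_slot=20.0):
    """Hypothetical cost: a fixed charge per site plus a cost per unit of capacity."""
    return sum(fixed * xi[a] + per_slot * cap[a] for a in xi)
```

For instance, building 300 slots at one of two candidate sites with a transfer flow of 250 passes the check under a budget of 10,000, whereas assigning capacity to an unselected site, or letting the flow exceed the built capacity, does not.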

Lower-level problem
At the lower level, a combined trip distribution/modal split/traffic assignment model was developed. The objective function of the lower-level model contains four elements defined by Eqs. (11)-(14). $z_1$ replicates the mathematical formulation for the stochastic user equilibrium. $z_2$ and $z_3$ represent the entropy functions on mode choice and destination choice behaviour, respectively. $z_4$ is the inverse of the elastic demand function. Eqs. (15), (16), and (18) are the flow conservation constraints. Eq. (17) states that the number of passengers travelling via path $p$ of mode $m$ includes the newly generated passengers who choose the path in question and the existing passengers who use this path. Eq. (19) is the definitional constraint relating link flow and path flow. For consistency, it is assumed that the modal and route choices of existing commuters are characterised via the multinomial logit discrete choice model, in the same way as those of the newly generated travel demand. By examining the Karush-Kuhn-Tucker (KKT) conditions of the lower-level problem, it is proved in the Appendix that the demand distribution, travellers' mode choice, and route choice follow a multinomial logit distribution. Compared with the model developed by Yang et al. (2000), our contributions are twofold. First, the stochasticity in passengers' route choice is considered, in the sense that passengers' route choice behaviour follows the stochastic user equilibrium instead of the user equilibrium. Second, an additional mode choice component is encapsulated to capture travellers' mode choice behaviour, and we prove that the resultant mode choice follows a multinomial logit model.
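To make the logit structure concrete, the sketch below computes mode and route shares for one OD pair from given path costs. It assumes the standard hierarchical logit form, in which the cost of a mode is the logsum over its paths; the exact expressions derived in the Appendix are not reproduced here and may differ in detail. The function names and the parameter names `theta` (route scale) and `gamma` (mode scale) are illustrative assumptions. A destination-choice layer could be stacked on top in the same way.

```python
import math


def logit_shares(costs, scale):
    """Multinomial logit shares: a lower cost receives a higher share."""
    c_min = min(costs)  # shift costs for numerical stability
    weights = [math.exp(-scale * (c - c_min)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def logsum_cost(costs, scale):
    """Expected perceived cost (logsum) over a set of alternatives."""
    c_min = min(costs)
    shifted = sum(math.exp(-scale * (c - c_min)) for c in costs)
    return c_min - math.log(shifted) / scale


def mode_and_route_shares(path_costs_by_mode, theta, gamma):
    """Return (mode shares, route shares within each mode) for one OD pair."""
    modes = list(path_costs_by_mode)
    # Each mode is represented by the logsum of its path costs.
    mode_costs = [logsum_cost(path_costs_by_mode[m], theta) for m in modes]
    mode_share = dict(zip(modes, logit_shares(mode_costs, gamma)))
    route_share = {m: logit_shares(path_costs_by_mode[m], theta) for m in modes}
    return mode_share, route_share
```

When a mode has a single path, its logsum equals the path cost, so the mode split depends only on `gamma`.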

Properties
Here, we briefly discuss the following properties of the bi-level model.
1) Solution existence. The existence of a solution to the lower-level problem is generally guaranteed because the link performance functions are continuous and differentiable, and the solution space is bounded by the travel demand (Sheffi, 1985). Thus, the feasibility of the bi-level model depends on whether the capacity constraint at the upper level is violated. Because the hard capacity constraint applies only to the newly constructed transfer infrastructure, the do-nothing solution is always feasible.
2) Capacity constraint. In the bi-level formulation, the flow capacity constraint can be placed in either the upper- or lower-level model. Nevertheless, as noted in Yang et al. (2000), the two placements have different behavioural implications, and the results vary. When the constraint is placed at the lower level, the resultant model yields a queuing time on saturated links. The Lagrange multipliers associated with binding capacity constraints are normally interpreted as queuing delay times (e.g., Bell, 1995; Lam et al., 1999, 2002). In such a case, the number of newly generated trips tends to reach its maximum value at the expense of the queuing time. In contrast, when the capacity constraint is imposed at the upper level, it leads to an assignment model with no queues. Accordingly, the number of newly generated trips may not reach the maximum value, depending on whether a saturated link exists. Finally, it is not necessary to impose the constraint at both levels because, in such a case, the capacity constraint at the lower level automatically ensures that the capacity constraint at the upper level is satisfied.

Solution algorithm
To solve bi-level programming problems, most existing studies adopt heuristic (Yang, 1995; Chiou, 2005; Angelo and Barbosa, 2015), metaheuristic (Fan et al., 2014a; Koh, 2007), or matheuristic (Szeto and Jiang, 2012; Jiang et al., 2013; Szeto and Jiang, 2014a; Carosi et al., 2019; Liu et al., 2019; Jiang, 2021) methods, given that the bi-level network design problem is well known to be NP-hard. Because the primary purpose of this study is to examine the effect of optimising transfer capacity under variable demand, a classic genetic algorithm (GA) is employed to solve the bi-level model.
It should be noted that although our problem has a form similar to that in Yang et al. (2000), the successive linear programming (SLP) method in their study cannot be directly applied to solve our bi-level model because our upper-level model is formulated as a mixed-integer linear programming problem. Nevertheless, once the decision variables ξ and c are given, the resultant upper-level model reduces to a linear programming problem, and the SLP method proposed by Yang et al. (2000) can be adopted. Therefore, in the GA, only ξ and c are encoded as decision variables.
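One possible way to encode ξ and c in a GA chromosome is sketched below. The paper does not specify its encoding, so this is purely an illustrative assumption: each candidate transfer link carries a hypothetical gene `(bit, fraction)`, where `bit` selects the site and `fraction` in [0, 1] scales the capacity between its minimum and maximum, so that every decoded individual satisfies the capacity bounds by construction.

```python
def decode(chromosome, cap_min, cap_max):
    """Map a chromosome to (xi, c), one entry per candidate transfer link.

    chromosome: list of (bit, fraction) genes (hypothetical encoding).
    cap_min, cap_max: lists of minimum and maximum capacities per link.
    """
    xi, cap = [], []
    for (bit, fraction), lo, hi in zip(chromosome, cap_min, cap_max):
        selected = 1 if bit else 0
        xi.append(selected)
        # Capacity is zero for unselected sites and within [lo, hi] otherwise.
        cap.append(lo + fraction * (hi - lo) if selected else 0.0)
    return xi, cap
```

Under this encoding, the fitness of an individual would be the number of trips returned by the SLP routine for the decoded ξ and c, with budget violations handled separately, for example by a penalty.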
In our preliminary experiments, it was found that a direct application of the SLP algorithm depicted in Yang et al. (2000) may not guarantee the convergence of the model because the step size for updating the decision variables in each iteration is insufficiently described in their study. Therefore, inspired by Zhang et al. (1985) and Ferris and Zavriev (1996), this study develops an adaptive step size method for updating a trust region, where the trust region can be understood as the lower and upper bounds of the decision variables in the linear programming model, and the adaptive step size determines the changes in the size of the trust region. The algorithm is described below, where the superscript k is introduced to indicate the iteration number when necessary.
Algorithm: An SLP with an adaptive step size for solving the maximum trip generation model.
Step 1: Initialise the iteration counter k = 0, set the initial lower and upper bounds for $o_r^{+}$, $r \in R$, and set ε as the parameter governing the convergence of the algorithm.
Step 2: Determine the initial trip distribution pattern.
Step 3: Solve the lower-level combined trip distribution/modal split/traffic assignment model and obtain the corresponding OD trips and link flows.
Step 4: Calculate the approximated influence factors $l_{ar}$ and $l_{sr}$.
Step 5: Formulate a linear approximation model for the upper-level model by replacing $q_{rs}^{+}(o)$ in Eqs. (3) and (4) and $v_a(o)$ in Eq. (6) with their linear approximations based on the influence factors.
Step 6: Solve the linear approximation model within the current lower and upper bounds.
Step 7: Check convergence.
Step 8: Update the step size.
Step 9: Set k = k + 1 and return to Step 3.

Remarks on the above algorithm.
(1) The maximum trip generation model is a bi-level model with given ξ and c.
(2) In Step 1, the initial solution region for the decision variables can be set large, as it will shrink over iterations. The subscript r indicates that each decision variable $o_r^{+}$ is associated with its own parameters for updating the step size. $\rho_r$ does not carry the superscript k, as the algorithm adopts a constant rate to update the solution region. A scheme that updates this rate over iterations could be developed in the future.
(3) In Step 4, the influence factors can also be obtained via a sensitivity analysis (e.g., Tobin and Friesz, 1988; Yang, 1997).
(4) Compared with the method proposed by Yang et al. (2000), our extensions are twofold. One is to include a trust region when solving the linear programming problem, and the other is to update the trust region adaptively in Step 8.
(5) The proof of convergence when using the SLP method to solve a nonlinear optimisation problem can be found in Zhang et al. (1985) and Ferris and Zavriev (1996). However, as our model is a bi-level model, a theoretical proof of convergence and global optimality may not be easily established. This is a challenging research direction to be explored in the future. In this study, we only examined the convergence numerically.
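The general idea of an SLP loop with a trust region can be sketched on a one-variable toy problem: maximise the generated trips o subject to a transfer flow v(o) not exceeding the capacity. Everything here is an assumption made for illustration: `toy_flow` stands in for the lower-level model, the influence factor is obtained by a finite difference, and the trust region is simply shrunk by a constant rate `rho` whenever the linearised solution violates the true constraint. This shrink-on-rejection rule is a generic scheme and not necessarily the step-size update used in the paper.

```python
def slp_max_trips(lower_level_flow, capacity, o_max,
                  delta=500.0, rho=0.5, eps=1e-6, max_iter=500):
    """Maximise o subject to lower_level_flow(o) <= capacity and 0 <= o <= o_max."""
    o = 0.0  # do-nothing starting point
    for _ in range(max_iter):
        v = lower_level_flow(o)
        # Finite-difference influence factor dv/do at the current point.
        h = 1e-4
        influence = (lower_level_flow(o + h) - v) / h
        # Trust region: bounds of the decision variable in the linear programme.
        lo, hi = max(0.0, o - delta), min(o_max, o + delta)
        # One-variable LP: max o' s.t. v + influence * (o' - o) <= capacity.
        if influence > 0.0:
            candidate = min(hi, o + (capacity - v) / influence)
        else:
            candidate = hi
        candidate = max(lo, candidate)
        # Reject the step and shrink the trust region if the true flow is infeasible.
        if lower_level_flow(candidate) > capacity + eps:
            delta *= rho
            continue
        if abs(candidate - o) < eps:
            return candidate
        o = candidate
    return o


def toy_flow(o):
    """Hypothetical convex response of the transfer flow to generated trips."""
    return 0.6 * o + 0.0002 * o * o
```

With `capacity=500` and `o_max=2000`, the loop approaches the root of 0.0002 o² + 0.6 o = 500, that is, o ≈ 679.45; with a very large capacity it stops at `o_max`.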

Numerical study
The numerical studies were designed to illustrate the properties of the model and examine two Braess-like paradox phenomena in a multimodal transport network. The algorithm was coded using MATLAB 2018. All the input data and results were deposited in the author's GitHub repository (https://github.com/hkujy/CapacitedMNDP).

Occurrence of the paradox
The first experiment was conducted to demonstrate that introducing P + R services could induce a Braess-like paradox in a multimodal transport network under a fixed OD demand. We constructed a small network that resembles the classic Braess network, as shown in Fig. 2. The network contains four nodes, where node B represents a metro station and node A denotes a potential parking area close to node B. There are four links, and their travel modes and link performance functions are listed in the figure. The travel demand between nodes O and D was set to 2000. Compared with the original setting in the Braess network, our network contains links with a constant travel time to represent travellers' walking time. Fig. 2(a) and (b), respectively, represent the network before and after constructing parking slots at node A, so that P + R is available. In the before scenario, there are two travel modes, and each mode has only one path, as shown in Fig. 2(a), that is, using a private vehicle via path O-A-D and using the metro via path O-B-D. In the after scenario, when a parking area is constructed at node A, the P + R mode is available to the travellers via path O-A-B-D.
To obtain the flow distribution in the network, only the lower-level model is solved because the upper-level objective is constant under a fixed demand. Accordingly, a capacity constraint is imposed on the lower-level problem. Table 2 summarises the total travel time and flow distribution obtained in the before and after scenarios. It shows that the total travel time in the after scenario is higher than that in the before scenario. This is similar to the Braess paradox (Braess, 1968), which states that adding a new link can deteriorate the network performance measured by the total travel time due to travellers' selfish routing behaviour. In our example, this is because the travel time of the P + R mode (path O-A-B-D) is much less than that of the other two modes; thus, it attracts most of the OD travellers. On the one hand, the travel time via link 2 slightly decreases because of the reduction in the number of car users. On the other hand, the travel time associated with link 1, through which travellers access the parking facilities, increases significantly. Thus, the overall travel time of the network increases, and a paradox occurs. Furthermore, from a mathematical point of view, the formulation of the lower-level problem only ensures that the assignment results follow a logit distribution; it does not distribute passengers so as to achieve the system optimum.
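The mechanism can be reproduced with a small logit-based stochastic assignment solved by the method of successive averages. The sketch below is an illustration only: the link performance functions, the scale parameter `theta = 0.1`, and the absence of a parking capacity limit are assumptions made here, because the functions of Fig. 2 are not reproduced in the text. The numbers therefore differ from those in Table 2.

```python
import math


def path_costs(paths, link_time, flow):
    """Path costs from path flows via link flows (paths: name -> list of links)."""
    link_flow = {a: 0.0 for a in link_time}
    for p, links in paths.items():
        for a in links:
            link_flow[a] += flow[p]
    return {p: sum(link_time[a](link_flow[a]) for a in links)
            for p, links in paths.items()}


def logit_sue(paths, link_time, demand, theta, iters=2000):
    """Logit stochastic user equilibrium by the method of successive averages."""
    names = list(paths)
    flow = {p: demand / len(names) for p in names}
    for k in range(1, iters + 1):
        cost = path_costs(paths, link_time, flow)
        c_min = min(cost.values())
        weight = {p: math.exp(-theta * (cost[p] - c_min)) for p in names}
        total = sum(weight.values())
        for p in names:
            target = demand * weight[p] / total  # logit loading at current costs
            flow[p] += (target - flow[p]) / k    # averaging step of size 1/k
    return flow, path_costs(paths, link_time, flow)


def total_time(flow, cost):
    return sum(flow[p] * cost[p] for p in flow)


# Hypothetical link performance functions (minutes; x is the link flow).
LINK_TIME = {
    "OA": lambda x: x / 50.0,  # congestible road link to the parking area
    "AD": lambda x: 45.0,      # road link with constant time
    "OB": lambda x: 45.0,      # walking access to the metro, constant time
    "BD": lambda x: x / 50.0,  # metro link with crowding
    "AB": lambda x: 5.0,       # transfer from parking area to metro station
}
BEFORE = {"car": ["OA", "AD"], "metro": ["OB", "BD"]}
AFTER = dict(BEFORE, pr=["OA", "AB", "BD"])

flow_b, cost_b = logit_sue(BEFORE, LINK_TIME, 2000.0, 0.1)
flow_a, cost_a = logit_sue(AFTER, LINK_TIME, 2000.0, 0.1)
tt_before = total_time(flow_b, cost_b)
tt_after = total_time(flow_a, cost_a)
```

Under these assumed functions, the before scenario splits the demand evenly with a total travel time of 130,000, while adding the P + R path attracts more than half of the travellers and raises the total to roughly 142,000, even though each traveller follows the logit choice rule.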

Effect of capacity on the occurrence of the paradox
This section examines the effect of the capacity constraint, that is, limited parking space, on the occurrence of the paradox, which was ignored in the previous example by assuming that there is enough parking space. In the test, the capacity was varied from 100 to 2000. Fig. 3(a) plots the total travel times of the before and after scenarios, whereas Fig. 3(b) displays the corresponding changes in the path flow. Fig. 3(a) shows that a paradox occurs as long as parking space is provided. The total travel time reaches its maximum when the capacity is 650. Afterwards, the total travel time starts to decline and reaches a stable level once the capacity is greater than 950. Fig. 3(b) shows that the increase in parking capacity attracts more travellers to use the P + R mode, represented by the rising flow associated with path 3. Meanwhile, it is observed that the flow travelling via path 3 always equals the capacity when the capacity is set below 950, meaning that the capacity constraint is binding. This implies that the number of travellers willing to travel via P + R could be larger than the available capacity, which could induce a queuing delay time (see Section 2.3.3 for the discussion of the model properties) added to the travel time of path 3. The results justify the necessity of optimising the capacity of parking space for P + R services in terms of reducing the total travel time.
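The interpretation of a binding capacity constraint as a queuing delay can be illustrated as follows. The sketch assumes fixed (flow-independent) path costs so that the delay can be found by bisection: it searches for the smallest extra delay on the P + R path at which the logit flow on that path no longer exceeds the capacity. The function names, costs, and scale parameter are hypothetical.

```python
import math


def pr_flow(delay, costs, pr_index, demand, theta):
    """Logit flow on the P + R path when an extra delay is added to its cost."""
    adjusted = list(costs)
    adjusted[pr_index] += delay
    c_min = min(adjusted)
    weights = [math.exp(-theta * (c - c_min)) for c in adjusted]
    return demand * weights[pr_index] / sum(weights)


def queuing_delay(costs, pr_index, demand, theta, capacity, tol=1e-9):
    """Delay (multiplier of the capacity constraint) that makes the flow fit."""
    if pr_flow(0.0, costs, pr_index, demand, theta) <= capacity:
        return 0.0  # constraint not binding, so no queue forms
    lo, hi = 0.0, 1.0
    while pr_flow(hi, costs, pr_index, demand, theta) > capacity:
        hi *= 2.0  # expand until the flow fits within the capacity
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pr_flow(mid, costs, pr_index, demand, theta) > capacity:
            lo = mid
        else:
            hi = mid
    return hi
```

For example, with path costs of 70, 70, and 60 (the last being P + R), a demand of 2000, and `theta = 0.1`, the unconstrained P + R flow is about 1152; a capacity of 650 then requires a delay of about 10.4, whereas a capacity of 2000 requires none.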

Effect of the scale parameters on the occurrence of the paradox
This section illustrates the effect of the scale parameter in the logit model on the occurrence of the paradox. In the toy example, there are three paths corresponding to the three different travel modes. Because each mode contains only one path, only the scale parameter γ affects the flow distribution. Theoretically, when its value increases to infinity, the result of the stochastic assignment model approaches that obtained from the user equilibrium model. In this experiment, the scale parameter was varied from 0.05 to 0.55. Fig. 4 plots the total travel times obtained from the before and after scenarios in (a) and the changes in the flow distribution among the three paths in (b)-(d). Fig. 4(a) shows that a higher value of the scale parameter diminishes the travel time gap between the before and after scenarios but does not eliminate it. Fig. 4(b) shows that the capacity constraint is binding regardless of the value of γ, as the flow on path 3 equals the capacity all the time. In contrast, in Fig. 4(c) and (d), the capacity constraint is not binding initially, which explains why, in Fig. 4(a), the curves corresponding to c = 600 and c = 1000 start from the same point and have equal total travel times.

Paradox under variable demand
In the following experiments, we adopted the network shown in Fig. 2 and varied the parking capacity from 100 to 2000. The existing and maximum demands were set to $o_0 = 100$ and $o_{\max} = 2000$, respectively. The results are presented in Fig. 5. Fig. 5(a) and (b), respectively, plot the changes in the flow of the three paths and the number of newly generated trips under variable capacities. In general, it can be observed that both the path flow and the number of newly generated trips increase with the transfer capacity until the number of newly generated trips reaches the maximum total demand. Notably, the flow travelling via path 2 grows more dramatically compared with those travelling via paths 1 and 3. Our explanation is that the changes in the travel time of path 1 are much gentler with the increase in the flow compared with those of the other two paths. This is verified in Fig. 5(c).
Under variable demand, the total travel time may not be a consistent measure because of changes in travel demand. Therefore, we examined the proportion of travellers using P + R with the increase in parking capacity and plotted the results in Fig. 5(d). An interesting phenomenon is observed: expanding parking capacity could reduce the proportion of travellers using P + R services. This indicates that adding more parking spaces does not make the P + R services more attractive to the newly generated travel demand. This could be considered a paradox because it is in contrast with our intuition that increasing the P + R capacity would increase the attractiveness of such a service. Nevertheless, despite the decrease in the share of P + R services, the market share of metro services continues to grow mildly. In terms of the path cost, as expected, it increases with the improvement in transfer capacity because a higher transfer capacity permits more travel demand to be generated in the network and induces higher congestion costs.

Table 2 Occurrence of the Braess-like paradox.
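A purely hypothetical calculation shows how the P + R share can fall while the P + R flow rises. The symbols $f_{P+R}$ (P + R flow), $q$ (total demand), and $s$ (P + R share), as well as the numbers, are assumptions for illustration and are not read from Fig. 5. Suppose that a capacity expansion raises the P + R flow from 100 to 200 while the total demand grows from 400 to 1000:

\[
s = \frac{f_{P+R}}{q}, \qquad s_{\text{before}} = \frac{100}{400} = 0.25, \qquad s_{\text{after}} = \frac{200}{1000} = 0.20 .
\]

The share declines because the total demand grows faster than the P + R flow, which is the mechanism behind the paradox under elastic demand.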

Discussion of paradox
In summary, experiments under fixed demand extend the Braess paradox to a multimodal context. Compared with existing studies, our network contains both flow-dependent and non-flow-dependent link performance functions to capture passengers' travel times via different modes and transfer walking time. Nevertheless, we acknowledge that the network and link performance functions are hypothetical, and the results are theoretical. Thus, future efforts are required to explore the empirical evidence. As the occurrence of the Braess paradox has been noted in real-world road networks (Bagloee et al., 2019), it may also be possible to detect it in a multimodal transport network.
Under elastic demand, our findings concur with those in the existing literature (Parkhurst, 1995, 2000; Noland et al., 2016) in the sense that providing parking and biking stations induces more trips. Interestingly, our case study in Section 4.2 shows that, although the total number of trips for P + R continues to increase, a high proportion of the newly generated travel demand travels via the public transport mode. Intuitively, this is because providing parking space generates a number of trips that exceeds the parking capacity. The excess passengers cannot travel via P + R and are distributed between public and private transport. Because the travel cost associated with public transport (e.g., metro in the example) grows mildly with the increase in travel demand compared with the cost associated with private transport, more passengers select public transport. The findings of this experiment underpin the necessity of maintaining a high quality of public transport when providing P + R services.

Effect of multimodal transfer capacity
This experiment was conducted to illustrate the design of the optimal transfer capacity under budget constraints in a multimodal transport network. We consider a more general network containing 12 links and seven nodes, as shown in Fig. 6. There are two transfer infrastructure options: one is to build parking spaces at node 7 to facilitate P + R trips, and the other is to construct bike docking slots at node 5 to promote B + R users. There are two OD pairs in this network, that is, OD pair (1,4) and OD pair (6,4), with the same existing demand of 100 commuters. Detailed route data and link performance functions can be accessed from the GitHub repository of the author.

Convergence of the SLP with an adaptive step size
Before demonstrating the effects of the capacity constraint, we first examine the performance of the SLP algorithm with an adaptive step size. The parameter for regulating the step size is increased from 0.3 to 0.9. In the experiment, the capacities associated with links 5-2 and 7-3 were set to 100, and the scale parameters were set to 0.1. The convergence plots are presented in Fig. 7. It can be observed that the algorithm converges under the proposed adaptive step size scheme.
The solutions obtained for different step sizes are listed in Table 3. First, it can be observed that the flow travelling via link 7-3 approaches the link capacity. The binding of this link confines the number of newly generated trips: even though there is residual capacity on link 5-2, no additional travel demand is generated. Second, the solution obtained under $\rho_r = 0.3$ deviates from those obtained when $\rho_r = 0.6$ and $\rho_r = 0.9$. This is because a smaller step size may cause the algorithm to converge to a local optimal solution.

Effect of capacity constraint
In this experiment, the capacities of the two links were simultaneously varied from 0 to 400. The total newly generated demands under different scale parameters are illustrated in Fig. 8, where the x- and y-axes denote the capacities of links 5-2 and 7-3, respectively, and the contours represent the new trips generated.
The following conclusions can be drawn.
1) First, increasing the capacity attracts more travel demand. This reveals that providing more infrastructure could lift the total travel demand.
2) Interestingly, expanding only one link could induce a higher travel demand than expanding both. For example, in Fig. 8(a), the number of newly generated trips is higher when the capacity of link 5-2 is 100 than when both capacities are 50. This implies that when the budget is tight, it is better to provide sufficient transfer capacity at one location. This rationalises one of the motivations of this study, namely incorporating the decision of where to construct the transfer infrastructure into the multimodal transport network design problem.
3) When the transfer infrastructure is constructed at both links, expanding the capacity of one link may not induce further demand. For example, in Fig. 8(a), when the capacity of link 7-3 equals 50, the number of newly generated trips is no more than 400 and remains constant despite the growth in the capacity of link 5-2. This is because, as revealed in Table 3, the capacity of one of the links is binding, which restricts the growth of the total travel demand. This means that the capacities must be increased simultaneously, and the capacity at each location should be optimised. This justifies the necessity of developing an optimisation model to determine the capacity allocation at different stations.
4) Finally, as can be observed by comparing the two subfigures, more demand is generated at a lower scale parameter. This is because the path flows are more spread among the different options when the scale parameter is low. Accordingly, the paths that traverse the capacitated links attract fewer passengers, and these links become binding at a higher level of newly generated demand.

*Note: $v_a$ is the link flow and $v_a/c_a$ is the flow/capacity ratio.

Optimal design in the Sioux-Falls network
The Sioux-Falls network is adopted to investigate the optimal design of multimodal transfer nodes using a GA. Since the original Sioux-Falls network only contains auto links, this study adds a bike network as well as transit lines to the Sioux-Falls network to mimic a multimodal urban transport network (see Fig. 9). There are a total of 65 nodes and 121 links in the modified multimodal Sioux-Falls network. The origin nodes 3, 2, 13, 20, 14 are respectively identical with the nodes 26, 36, 35, 41, 32 and the nodes 43, 57, 45, 63, 48 in physical space, while the destination nodes 9, 10, 8, 16 are identical with the nodes 29, 30, 38, 39 and the nodes 52, 53, 59, 16 in physical space. The maximum potential demand of each origin is 2000. The central urban area is highlighted by the blue oval. The 8 green nodes and dotted links are the candidate transfer links. The total construction cost function is given by $G = 5 \times c^{P+R}_{a,\max} + c^{B+R}_{a,\max}$, and the total budget is 300. More detailed link attributes and the demand setting can be obtained from the author's GitHub repository.
To solve the bi-level model, we use the Matlab GA toolbox with the proposed SLP algorithm. The GA was run 8 times with different random seeds. The algorithm took 45242.9 s on average to finish 100 iterations, and the average best objective value over 100 iterations is plotted in Fig. 10. The best objective value of the 8 runs is 4275. In the optimal solution, no parking spaces are suggested at the P + R transfer nodes, whereas 300 biking docks are constructed at node 62.
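The budget constraint and the reported solution can be checked with a short sketch. It assumes that the cost function sums over all candidate transfer links (the summation is not visible in the extracted formula), and the function names and the lists of capacities are hypothetical. Only the unit costs (5 per parking space, 1 per bike dock) and the budget of 300 come from the text.

```python
def construction_cost(parking_caps, bike_caps,
                      parking_unit_cost=5.0, bike_unit_cost=1.0):
    """Total cost G, assuming a sum over candidate P + R and B + R links."""
    return (parking_unit_cost * sum(parking_caps)
            + bike_unit_cost * sum(bike_caps))

def within_budget(parking_caps, bike_caps, budget=300.0):
    """Check the budget constraint G <= budget."""
    return construction_cost(parking_caps, bike_caps) <= budget

# Reported solution: no parking spaces, 300 bike docks at a single node.
print(construction_cost([0, 0, 0, 0], [300, 0, 0, 0]))  # cost of the reported design
print(within_budget([0, 0, 0, 0], [300, 0, 0, 0]))

# The same budget would buy at most 60 parking spaces.
print(within_budget([60, 0, 0, 0], [0, 0, 0, 0]))
print(within_budget([60, 0, 0, 0], [1, 0, 0, 0]))
```

Under these unit costs, one parking space displaces five bike docks, which is consistent with the GA favouring B + R capacity when the objective rewards the number of trips served.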
The results show that if the objective is to maximise the network capacity, the B + R transfer node is preferred, as it occupies a smaller area per unit. It is worth acknowledging that the results of the model represent an ideal scenario under various assumptions. In particular, the premise that drivers will switch their travel mode from P + R to B + R as long as the parking infrastructure is replaced with biking infrastructure might be strong and unrealistic. Nevertheless, the results imply the potential of promoting bike usage to enhance urban mobility. Therefore, they encourage the decision-maker to consider and implement other strategies that could achieve a joint effect in promoting the usage of bicycles, such as better integration of bike and public transport, expanding and constructing dedicated bike lanes, and providing a safe and healthy cycling environment.

Conclusion
In view of the boom in multimodal mobility, this study develops a bi-level model for the multimodal network design problem. The upper-level problem is to determine the location and capacity of the transfer infrastructure to be built simultaneously. The lower-level problem is a combined trip distribution/modal split/traffic assignment model. It is proved that at optimality, the trip distribution and commuters' route-choice behaviour depicted by the optimal solution of the lower-level problem follow a multinomial logit distribution. Numerical studies were conducted to illustrate the optimal design of transfer capacity in a multimodal transport network and a Braess-like paradox phenomenon in a multimodal transport network, stating that constructing parking space to stimulate the usage of P + R service may induce higher total passenger travel time. Moreover, it is found that under variable demand, the modal share of the P + R service for the newly generated travel demand could decline with an increase in the capacity of parking spaces.
This study provides various research directions. For example, (1) the lower-level model assumes that the travellers' route choice follows the multinomial logit model. It would be interesting to examine other behavioural models, such as nested logit (e.g., Zheng). (2) The network and link performance functions used in the example to demonstrate the occurrence of the paradox are hypothetical and of theoretical interest. In practice, as the Braess paradox has been observed in road networks in the real world (Bagloee et al., 2019), it is expected that such a phenomenon can also be observed in a multimodal transport network. Other paradoxes, such as the Downs-Thomson paradox, which were not considered in this study, might also occur in a multimodal transport network (Ku et al., 2020; Prieto Curiel et al., 2021; Wang et al., 2021a). Thus, one future research direction would be to use empirical studies to detect different paradoxes in a multimodal transport network. (3) Once the paradox is observed, it is crucial to develop a method to avoid it. Bazzan and Klügl (2005) noted that this could be avoided by providing route recommendations. With the emergence of personalised mobility, we propose exploring personalised route guidance (e.g., Ceder and Jiang, 2020; Jiang and Ceder, 2021). (4) The lower-level model developed in this study relies on generating a path choice set in advance. A large path set may impede the application to a large network, and the choice set could affect the solution, as discussed for the traditional stochastic user equilibrium model (e.g., Prashker and Bekhor, 2004; Bekhor and Toledo, 2005). Developing a link-based or approach-based assignment model would be a future research avenue to overcome these limitations (e.g., Szeto and Jiang, 2014a; Jiang, 2021). Meanwhile, it is also possible to consider the land use effect in the lower-level model (Zhong et al., 2021a, 2021b). (5) Regarding demand elasticity and trip distribution, this study adopts a simplified function to represent the destination's attractiveness. One way to improve this would be to incorporate more advanced land-use and transportation interaction models (e.g., Zhong et al., 2021a, 2021b).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

which implies that the passengers' route choice is in line with the multinomial logit distribution.

Travellers' mode choice follows multinomial logit distribution
We consider the KKT conditions with respect to variable $q^{m+}_{rs}$, which is which implies that the passengers' mode choice is in line with the multinomial logit distribution.

Travellers' destination choice follows multinomial logit distribution
The procedure is similar to the preceding proofs, and we refer to Yang et al. (2000) for more details.