Dynamic segment shared protection algorithm for reliable wavelength-division-multiplexing mesh networks

With the maturation of Wavelength-Division-Multiplexing (WDM) technology in optical networks, survivable design has become a key issue. In this paper, we propose a Segment Shared Protection Algorithm (SSPA), which is based on the reliability of the network and the different levels of fault tolerance requested by users, to protect against single-link failures in WDM optical networks. The main idea of the SSPA is to provide a backup path for one segment of the primary path of each connection request, where the segment is chosen in accordance with the policy of Differentiated Reliability (DiR). While guaranteeing the blocking probability and the connection's reliability, the SSPA achieves a higher resource utilization ratio and a faster recovery time than the previous algorithm PSPA-DiR. We evaluate the effectiveness of the SSPA and the results are found to be promising.
©2005 Optical Society of America
OCIS codes: (060.4510) Optical communications; (060.4250) Networks.
References and links
1. S. Ramamurthy, L. Sahasrabuddhe, and B. Mukherjee, “Survivable WDM mesh networks,” J. Lightwave Technol. 21, 870-883 (2003).
2. Y. Xiong, D. Xu, and C. Qiao, “Achieving fast and bandwidth-efficient shared-path protection,” J. Lightwave Technol. 21, 365-371 (2003).
3. C. Saradhi, M. Gurusamy, and L. Zhou, “Differentiated QoS for survivable WDM optical networks,” IEEE Commun. Mag. 42, 8-14 (2004).
4. C. V. Saradhi and C. S. R. Murthy, “Routing differentiated reliable connections in WDM optical networks,” Opt. Net. Mag. 3, 50-67 (2002).
5. L. Guo, H. Yu, and L. Li, “Joint routing-selection algorithm for a shared path with differentiated reliability in survivable wavelength-division-multiplexing mesh networks,” Opt. Express 12, 2327-2337 (2004), http://www.opticsexpress.org/abstract.cfm?URI=OPEX-12-11-2327
6. J. Zhang and B. Mukherjee, “A review of fault management in WDM mesh networks: basic concepts and research challenges,” IEEE Network 18, 41-48 (2004).
7. P. H. Ho, J. Tapolcai, and T. Cinkler, “Segment shared protection in mesh communications networks with bandwidth guaranteed tunnels,” IEEE/ACM Trans. Networking 12, 1105-1118 (2004).
8. D. Xu, Y. Xiong, and C. Qiao, “Novel algorithms for shared segment protection,” IEEE JSAC 21, 1320-1331 (2003).
9. L. Guo, H. Yu, and L. Li, “A new shared-path protection algorithm under shared-risk link group constraints for survivable WDM mesh networks,” Opt. Commun. 246, 285-295 (2005).
10. P. H. Ho and H. T. Mouftah, “A novel survivable routing algorithm for shared segment protection in mesh WDM networks with partial wavelength conversion,” IEEE JSAC 22, 1548-1560 (2004).
11. L. Guo, H. Yu, and L. Li, “Path protection algorithm with trade-off ability for survivable wavelength-division-multiplexing mesh networks,” Opt. Express 12, 5834-5839 (2004), http://www.opticsexpress.org/abstract.cfm?URI=OPEX-12-24-5834
12. L. Guo, H. Yu, and L. Li, “A new path protection algorithm for meshed survivable wavelength-division-multiplexing networks,” Lecture Notes in Computer Science 3420, 68-75 (2005).
(C) 2005 OSA. 18 April 2005 / Vol. 13, No. 8 / OPTICS EXPRESS 3087. Received 28 February 2005; revised 6 April 2005; accepted 6 April 2005.


Introduction
In Wavelength-Division-Multiplexing (WDM) optical networks, a single fiber link carries tremendous bandwidth, so a fiber cut can cause many light-paths to fail and lead to a very large loss of data. Protection design is therefore very important for fault management in survivable WDM optical networks.
Some previous papers [1,2,9] have proposed a protection algorithm called the Path Shared Protection Algorithm (PSPA), which uses resources more efficiently than other protection algorithms. PSPA computes a primary path and a backup path for each connection request, so that a single-link failure, the dominant failure type in WDM optical networks, can be tolerated completely. However, the PSPA does not consider users' requirements for Differentiated Reliability (DiR): it always provides 100% reliability against a single-link failure. In fact, some users only need lower connection reliability, such as 98% or 96% [3,4]. For differentiated reliable WDM optical networks, good resource utilization and a low blocking ratio therefore cannot be guaranteed with the PSPA [5].
Recently, some papers [3][4][5][6][7] studied Differentiated Reliability (DiR) and presented a new algorithm called the Path Shared Protection Algorithm with DiR (PSPA-DiR) for WDM optical networks. PSPA-DiR first computes a primary path for each connection request and then computes the reliability of that primary path. If the reliability of the primary path satisfies the user's requirement, no backup path is computed; otherwise, a backup path is computed. The simulation results in [3][4][5] show that the PSPA-DiR has a higher resource utilization ratio and a lower blocking ratio than the PSPA. However, in PSPA-DiR, if the reliability of the primary path is less than the user's requirement, the backup path protects the full primary path rather than only part of it (namely, a segment). If we instead protect only a segment, chosen according to the policy of the DiR, the backup resources can be reduced and the resource utilization ratio improved further. Another advantage of segment protection is that the segment and its backup path are shorter than the full primary path and its backup path, so when a failure occurs, the recovery time of segment protection is shorter than that of path protection [10].
In this paper, based on the DiR [3-6] and the concept of the segment path [7,10], we propose a Segment Shared Protection Algorithm (SSPA) to protect against single-link failures according to users' requirements. SSPA first computes a least-cost primary path for each connection request and checks the reliability of the primary path. If the reliability of the primary path satisfies the user's requirement, no backup path needs to be assigned; otherwise, SSPA divides the primary path into two segments in accordance with the required reliability of the connection and computes a backup path for only one segment. To share reserved backup wavelengths, if two protected segments are link-disjoint, their backup paths can share common reserved wavelengths. The simulation results show that the SSPA achieves a higher resource utilization ratio and a shorter recovery time than the previous PSPA-DiR.
The rest of the paper is organized as follows. Section 2 presents the analysis of reliability, segment protection with the connection's reliability, the link-cost functions, and the recovery time after a failure. Section 3 presents the network model and the process of the SSPA. Section 4 evaluates the performance of the SSPA. Section 5 concludes the paper.

Analysis of reliability
Fault tolerance refers to the ability of a network to restore a connection's traffic after a failure. In this paper, reliability is the probability that a system or connection operates correctly over a period of time; it ranges from 0 (worst) to 1 (perfect). Fiber reliability is determined by many environmental factors (e.g., temperature, earthquakes, humidity) and man-made factors (e.g., dredging, fires). The reliability of a fiber link (i,j) is denoted as $R_{(i,j)}$. When the network is first built, $R_{(i,j)}$ can be determined by the fiber component manufacturers. After several years, $R_{(i,j)}$ can be estimated from the failure rate observed in past operation. The reliability of a light-path p, denoted as $R_p$, is the product of the reliabilities of the fiber links traversed by p.
In Fig. 1, the primary path $P_p$ is 1-2-3-4-5. Each fiber link has its own reliability $R_{(i,i+1)}$ (i=1,2,3,4). We assume that the failures of the fiber links are independent, so the reliability of the primary path is $R_p = \prod_{i=1}^{4} R_{(i,i+1)}$. If $R_p$ is less than the reliability required by the user, then for the PSPA-DiR we can choose 1-6-7-8-5 as the backup path (denoted as $P_b$) for the primary path, and the reliability of the backup path is $R_b = R_{(1,6)} R_{(6,7)} R_{(7,8)} R_{(8,5)}$. The reliability of the connection, denoted as $R_c$, is then
$$R_c = R_p + (1 - R_p) \times R_b \qquad (1)$$
For the SSPA, we divide the primary path into two non-overlapping segments (how to divide the primary path is discussed in Section 3): the first segment, which is not protected, is denoted as $P_{us}$ and its reliability as $R_{us}$; the second segment, which is protected, is denoted as $P_{ps}$ and its reliability as $R_{ps}$. The SSPA computes a backup path only for the second segment $P_{ps}$. In Fig. 1, $P_p$ is 1-2-3-4-5, $P_{us}$ is 1-2-3, $P_{ps}$ is 3-4-5, and the backup path for $P_{ps}$, denoted as $P_{bs}$ with reliability $R_{bs}$, is 3-9-5. The reliability of the connection is then given by Eq. (2), that is, $R_c$ = (reliability of the first segment) × (joint reliability of the second segment and its backup path):
$$R_c = R_{us} \times \left( R_{ps} + (1 - R_{ps}) \times R_{bs} \right) \qquad (2)$$
Let $R_r$ denote the reliability required by the user. Assume that the reliability of each fiber link is 0.98 and that $R_r = 0.95$. In Fig. 1, for the PSPA-DiR, we choose 1-6-7-8-5 as the backup path and get $R_c = 0.99393$ according to Eq. (1). For the SSPA, we choose 3-9-5 as the backup path to protect the second segment and get $R_c = 0.95889$ according to Eq. (2). Both protection mechanisms satisfy the required reliability because both values of $R_c$ are greater than $R_r$ (= 0.95). We can also see in Fig. 1 that the backup path of the second segment is shorter than the backup path of the full primary path, and that the SSPA uses fewer backup wavelengths (2 wavelengths for $P_{bs}$) than the PSPA-DiR (4 wavelengths for $P_b$), so the SSPA has better resource utilization than the PSPA-DiR.
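The arithmetic of the Fig. 1 example can be checked with a short sketch. It assumes independent link failures, takes the reliability of a path as the product of its link reliabilities, and treats a working path and its link-disjoint backup as a parallel pair, so that the pair fails only if both fail. The function names are illustrative and not part of the original algorithm description.

```python
from math import prod

def path_reliability(link_reliabilities):
    """Reliability of a light-path: product of its (independent) link reliabilities."""
    return prod(link_reliabilities)

def joint_reliability(r_working, r_backup):
    """Reliability of a working path protected by a link-disjoint backup path.
    The pair fails only when both paths fail."""
    return r_working + (1.0 - r_working) * r_backup

def rc_path_protection(r_p, r_b):
    """Connection reliability when the whole primary path is protected."""
    return joint_reliability(r_p, r_b)

def rc_segment_protection(r_us, r_ps, r_bs):
    """Connection reliability when only the second segment is protected."""
    return r_us * joint_reliability(r_ps, r_bs)

R_LINK = 0.98  # per-link reliability used in the Fig. 1 example

# PSPA-DiR: primary 1-2-3-4-5 (4 links), backup 1-6-7-8-5 (4 links)
rc_path = rc_path_protection(path_reliability([R_LINK] * 4),
                             path_reliability([R_LINK] * 4))

# SSPA: unprotected 1-2-3 (2 links), protected 3-4-5 (2 links), backup 3-9-5 (2 links)
rc_segment = rc_segment_protection(path_reliability([R_LINK] * 2),
                                   path_reliability([R_LINK] * 2),
                                   path_reliability([R_LINK] * 2))

print(f"path protection:    R_c = {rc_path:.5f}")
print(f"segment protection: R_c = {rc_segment:.5f}")
```

Under these assumptions the segment-protection value comes out at about 0.95889 and the path-protection value at roughly 0.994, both above the required 0.95 and close to the values quoted in the text.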

Cost function
An advantage of shared protection is the larger degree of wavelength resource sharing it allows. In this section, we define the cost functions used in SSPA to compute the primary and backup paths so that wavelengths are shared.
Assume the network is G(N,E), where N and E are the sets of nodes and bi-directional fiber links, respectively. The capacity on link i (∈ E) can be categorized into the following three types:
(1) Free capacity, denoted as $f_i$: capacity that can still be used by later primary or backup paths.
(2) Reserved capacity, denoted as $RC_i$: capacity reserved by some backup paths.
(3) Working capacity, denoted as $W_i$: capacity taken by some primary paths, which cannot be used for any other purpose until the corresponding primary path is released.
The cost function $cw'_i$ for finding a primary path ($P_p$) with requested bandwidth (RB) is given by Eq. (3), where $c_i$ and $R_{L_i}$ are the basic cost and the reliability of link i, respectively, and k is a parameter that adjusts the tradeoff between the basic cost and the reliability along the chosen path (we set k=1 in the later simulations). We introduce the link reliability into the link cost function in order to find a primary path that has both low cost and high reliability. According to Eq. (3), links with higher reliability have lower link cost. If the primary paths traverse those links, it is more likely that their reliability exceeds the user's requirement, in which case no backup path has to be computed and reserved wavelength resources are saved.

Fig. 2. Illustration of capacity along link i (∈ E): working capacity, free capacity, and reserved capacity (sharable and non-sharable).
To find a backup path ($P_{bs}$) for the protected segment ($P_{ps}$) on the primary path, we first define the corresponding cost function. Given the requested bandwidth (RB) and the primary path found ($P_p$), the reserved capacity ($RC_i$) along link i can be further divided into two types:
(1) Sharable capacity, denoted as $sh_i$: capacity that has been reserved by protected segments that are link-disjoint with $P_{ps}$ and can therefore be shared by $P_{ps}$.
(2) Non-sharable capacity, denoted as $none\text{-}sh_i$: capacity that has been reserved by protected segments that are not link-disjoint with $P_{ps}$ and therefore cannot be shared by $P_{ps}$.
It follows that $RC_i = sh_i + none\text{-}sh_i$. Figure 2 illustrates the capacity along link i.
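The split of the reserved capacity into sharable and non-sharable parts can be illustrated with a small sketch. It assumes a per-wavelength bookkeeping model that is not spelled out in the text: each reserved wavelength on link i records the protected segments whose backup paths already use it, and the wavelength is sharable by a new protected segment only if every one of those segments is link-disjoint with it. The data layout and function names are hypothetical.

```python
def link(u, v):
    """Undirected link identifier (the fiber links are bi-directional)."""
    return frozenset((u, v))

def split_reserved_capacity(reserved_wavelengths, new_segment_links):
    """Split the reserved capacity on one link into (sharable, non-sharable).

    reserved_wavelengths: one entry per reserved wavelength on the link; each
        entry is a list of protected segments (sets of links) whose backup
        paths already share that wavelength.
    new_segment_links: set of links of the new protected segment P_ps.
    """
    new_links = set(new_segment_links)
    sharable = 0
    for segments in reserved_wavelengths:
        # Sharable only if no existing segment has a link in common with P_ps.
        if all(new_links.isdisjoint(segment) for segment in segments):
            sharable += 1
    return sharable, len(reserved_wavelengths) - sharable

# New protected segment P_ps = 3-4-5.
p_ps = {link(3, 4), link(4, 5)}

# Two wavelengths are already reserved on the link under consideration.
reserved = [
    [{link(1, 2)}, {link(6, 7)}],   # both segments are link-disjoint with P_ps
    [{link(4, 5), link(5, 6)}],     # shares link (4,5) with P_ps
]

sh_i, none_sh_i = split_reserved_capacity(reserved, p_ps)
print(sh_i, none_sh_i)
```

In this toy state the first wavelength can be reused by the new backup path and the second cannot, and the two counts add up to the reserved capacity of the link.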
The cost function $cb'_i$ for finding a backup path ($P_{bs}$) for the protected segment ($P_{ps}$) with requested bandwidth (RB) is given by Eq. (4), where ε is a sufficiently small positive constant, such as 0.001 or 0.0001, and a is a positive constant (we set ε=0.001 and a=1 in the later simulations). In our algorithm, the backup path first takes the sharable capacity on a link if enough is available. If the sharable capacity covers the requested bandwidth (RB), it is reserved for the backup path of the new connection request and no new wavelength has to be allocated, so we set the cost of link i to the small positive constant ε. If there is not enough sharable capacity on the link, some free capacity is taken, and the link cost is determined by how much free capacity is taken. If the sum of the sharable and free capacities is less than RB, the link is unavailable for the backup path, so we set the link cost to infinity. According to Eq. (4), links that already have enough reserved capacity therefore have lower link cost. If the backup paths traverse these links, no new backup wavelengths need to be reserved, and the resource utilization ratio improves.
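The three cases described for the backup link cost can be sketched as a small function. The first and last branches follow the text directly. The middle branch is only a placeholder: the text states that the cost depends on how much free capacity is taken, and the linear form a × (RB − sh_i) used here is an assumption made for illustration, not the expression of Eq. (4).

```python
import math

EPSILON = 0.001  # small positive constant, value used in the simulations
A = 1.0          # positive constant a, value used in the simulations

def backup_link_cost(sh_i, f_i, rb, epsilon=EPSILON, a=A):
    """Cost of link i when routing a backup path for a protected segment.

    sh_i: sharable reserved capacity on the link
    f_i:  free capacity on the link
    rb:   requested bandwidth
    """
    if sh_i >= rb:
        # Enough sharable capacity: no new wavelength has to be reserved.
        return epsilon
    if sh_i + f_i < rb:
        # Not enough sharable plus free capacity: the link is unusable.
        return math.inf
    # ASSUMPTION: cost grows with the amount of free capacity taken.
    return a * (rb - sh_i)

print(backup_link_cost(sh_i=1, f_i=0, rb=1))  # fully shared
print(backup_link_cost(sh_i=0, f_i=2, rb=1))  # one free wavelength taken
print(backup_link_cost(sh_i=0, f_i=0, rb=1))  # unavailable
```

With costs of this shape, a shortest-path search prefers links on which the new backup path can reuse wavelengths that are already reserved.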

Recovery time
Another advantage of segment protection over path protection is the faster recovery after a failure. In shared protection schemes (SSPA, PSPA, and PSPA-DiR), the backup resources are reserved in advance but cannot be configured before a failure occurs. In this paper, we focus on the situation in which there is only a single-link failure in the network. For SSPA, if the failure occurs on the segment of the primary path that is not covered by the backup path ($P_{us}$), the connection cannot be restored, so a new primary path has to be computed. The following discussion assumes that the failure occurs on the segment of the primary path protected by the backup path ($P_{ps}$). The recovery procedure for SSPA is briefly as follows: after a link failure, the downstream node of the failed link detects the failure immediately and sends a notification indicator signal (NIS) to notify the beginning and end nodes of the protected segment ($P_{ps}$); the beginning node of $P_{ps}$ then sends a wake-up signal (WUS) to activate the configuration of the backup path ($P_{bs}$). The configuration along $P_{bs}$ can be conducted in a pipelined manner. Finally, the working traffic is switched over to the backup path.
In Fig. 3(a), we assume that link (4,5) fails. The downstream node 5 (or 4) detects the failure, and node 4 then propagates a notification indicator signal (NIS) to node 3. After node 3 receives the NIS, a wake-up signal (WUS) is sent to all the nodes along the backup path 3-8-9-10-6 and these nodes are configured. The end node 6 of the protected segment is also informed to start receiving the traffic along the backup path. Finally, node 3 switches the traffic to the backup path and the recovery is complete.
Figure 3(b) illustrates the situation in path-shared protection. The difference from segment shared protection is that the downstream node of the failed link sends the NIS to the beginning node of the primary path.
The recovery time depends mainly on the failure detection time δ (assumed to be 10 µs), the message-processing time p at a node (assumed to be 20 µs), the signal propagation time s, which depends on the length of the links the signals travel, and the configuration delay ε at the nodes along the backup path (assumed to be 5 ms). The recovery time $T_r$ is calculated from these terms as given in Eq. (5). This paper investigates the impact of the routing strategy on the recovery time, so the signal propagation time s, which is directly proportional to the physical length of the segment (or primary path) and the corresponding backup path, is the main contribution to the recovery time $T_r$. In Fig. 3, the segment and its backup path are shorter than the full primary path and its backup path, so after a failure the recovery time of segment protection is shorter than that of path protection according to Eq. (5) and Eq. (6).
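The effect of path length on the propagation term alone can be illustrated with a short sketch. It assumes that s is the total fiber distance traveled by the NIS and the WUS divided by the speed of light in fiber (2×10^8 m/s). The distances below are hypothetical and are not taken from Fig. 3, and the other terms of the recovery time are deliberately left out.

```python
FIBER_SPEED_M_PER_S = 2e8  # speed of light in fiber

def propagation_time(nis_distance_m, wus_distance_m, speed=FIBER_SPEED_M_PER_S):
    """Signal propagation time in seconds for the NIS and WUS distances."""
    return (nis_distance_m + wus_distance_m) / speed

# Hypothetical distances: the NIS travels back to the start of the protected
# segment (or of the whole primary path), the WUS travels along the backup path.
segment_s = propagation_time(nis_distance_m=200e3, wus_distance_m=800e3)
path_s = propagation_time(nis_distance_m=600e3, wus_distance_m=1500e3)

print(f"segment protection: s = {segment_s * 1e3:.2f} ms")
print(f"path protection:    s = {path_s * 1e3:.2f} ms")
```

For these illustrative distances the propagation time is 5 ms for the segment case and 10.5 ms for the path case, the same order as the 5 ms configuration delay, which is why the routing strategy has a visible effect on the recovery time.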

Network model
We define the network topology G(N,L,W) for a given meshed WDM optical network, where N is the set of nodes, L is the set of bi-directional fiber links (each link is assumed to consist of two fibers in opposite directions), and W is the set of wavelengths on a fiber. Connection requests arrive dynamically, and only one request arrives at a time. We assume that each request asks for one wavelength of bandwidth and that every node has full wavelength-conversion capability. Before introducing the process of SSPA, we define the following symbols:
$L_i$: the fiber link i, $L_i \in L$.
$N_j$: the node j, $N_j \in N$.
$R_{L_i}$: the reliability of $L_i$. It can be determined from long-term experience, as mentioned in Section 2. We suppose that the reliabilities of the fiber links are independent.
$c_i$: the basic cost of link i. It is determined by the physical length of the fiber link, the expense of installing the fiber link, and so on.
$cw'_i$, $cb'_i$: the dynamic costs of link i for computing the primary path and the backup path, respectively. They are determined by the basic cost of the link and the current state of the network.
$R_{(s,d)}$: the connection request from source node s to destination node d.
$R_r$: the reliability of the connection requested by the applications/users.
$P_p$: the primary path for the connection request $R_{(s,d)}$.
$P_b$: the backup path for $P_p$. $P_b$ and $P_p$ should be link-disjoint.
List($L_j$) for $P_p$: the set of all fiber links $L_j$ that are traversed by $P_p$. We suppose that n fiber links are traversed by $P_p$, so j=1, 2, 3, ..., n.
$R_p$: the reliability of the primary path $P_p$, calculated as
$$R_p = \prod_{j=1}^{n} R_{L_j} \qquad (7)$$
M: the "mid-node" on $P_p$. It divides $P_p$ into two segments (the unprotected segment and the protected segment).
$P_{us}$, $P_{ps}$: the unprotected segment and the protected segment on $P_p$. They are separated by M.
$P_{bs}$: the backup path for $P_{ps}$. $P_{ps}$ and $P_{bs}$ should be link-disjoint.
$R_c$: the connection's reliability, calculated as in Eq. (2).

The process of SSPA
Step 1: Wait for a request. If the request is for establishing a connection, go to Step 2. If the request is for releasing a connection, update the network's state and go back to Step 1.
Step 2: Adjust the link cost $cw'_i$ (for all $L_i \in L$) according to Eq. (3) and compute the minimal-cost path $P_p$ from node s to node d with Dijkstra's algorithm. If no $P_p$ can be found, reject the request and go back to Step 1. Otherwise, compute the reliability $R_p$ of $P_p$.
If $R_p \geq R_r$, no backup path needs to be assigned for $P_p$. Accept the request, allocate resources for $P_p$, and go back to Step 1. If $R_p < R_r$, a backup path needs to be assigned for $P_p$. Record List($L_j$) for $P_p$ and go to Step 3.
Step 3: According to Eq. (8), find the maximal value of m. […]
[…] defined as β/µ = β. A request is rejected immediately if the algorithm cannot find suitable paths, so there are no waiting queues. The bandwidth of a request is one wavelength. The reliability of each fiber link is uniformly distributed between 0.97 and 0.99. The total number of connection requests is $10^6$. We adopt the PSPA algorithm [4] and the PSPA-DiR algorithm [7] for comparison.
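Step 2 of the SSPA process (least-cost primary path, followed by the reliability check against the requested reliability) can be sketched as follows. This is a minimal illustration under stated assumptions: each link carries a single static cost as a stand-in for the dynamic cost $cw'_i$, wavelength availability is ignored, and the graph layout and function names are hypothetical.

```python
import heapq
from math import prod

def build_graph(edges):
    """edges: iterable of (u, v, cost, reliability) for bi-directional links."""
    graph = {}
    for u, v, cost, rel in edges:
        graph.setdefault(u, {})[v] = (cost, rel)
        graph.setdefault(v, {})[u] = (cost, rel)
    return graph

def least_cost_path(graph, s, d):
    """Dijkstra's algorithm; returns the node list from s to d, or None."""
    dist, prev, done = {s: 0.0}, {}, set()
    heap = [(0.0, s)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == d:
            break
        for v, (c, _rel) in graph.get(u, {}).items():
            new_cost = cost + c
            if new_cost < dist.get(v, float("inf")):
                dist[v], prev[v] = new_cost, u
                heapq.heappush(heap, (new_cost, v))
    if d not in dist:
        return None
    path = [d]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]

def step2(graph, s, d, r_required):
    """Returns (decision, primary_path, R_p) for one connection request."""
    path = least_cost_path(graph, s, d)
    if path is None:
        return "reject", None, None
    r_p = prod(graph[u][v][1] for u, v in zip(path, path[1:]))
    if r_p >= r_required:
        return "accept_unprotected", path, r_p  # no backup path needed
    return "needs_backup", path, r_p            # continue with Step 3

# Topology loosely modeled on Fig. 1; costs are illustrative.
G = build_graph(
    [(i, i + 1, 1.0, 0.98) for i in range(1, 5)]
    + [(1, 6, 2.0, 0.98), (6, 7, 2.0, 0.98), (7, 8, 2.0, 0.98), (8, 5, 2.0, 0.98)]
)
print(step2(G, 1, 5, 0.95))
print(step2(G, 1, 5, 0.90))
```

With a per-link reliability of 0.98, the four-link primary path 1-2-3-4-5 has a reliability of about 0.922, so a request with $R_r = 0.95$ proceeds to the segment-division step, while a request with $R_r = 0.90$ is accepted without any backup path.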
In accordance with [9,12], we use the backup resources per connection (BRPC) and the blocking probability (BP) to evaluate performance. We also compare the recovery time (RT) discussed in Section 2 for the three algorithms.
A smaller BRPC means that fewer wavelengths need to be assigned. It also means that fewer backup wavelengths are reserved along all the backup paths, that is, a higher resource utilization ratio or a higher degree of sharing of the reserved resources. A higher resource utilization ratio leads to a lower traffic blocking probability because subsequent requests can use more free wavelengths. Fig. 5(a-d) shows that, in both test topologies, the SSPA achieves a lower blocking probability and a higher resource utilization ratio than the other algorithms. The reason (see the analysis in subsection 2.1) is that the SSPA provides partial protection for a segment of the primary path according to the differentiated reliability required by the users, so it reserves fewer backup wavelengths than the PSPA and the PSPA-DiR. More free wavelengths can then be assigned to subsequent traffic, so the BP of the SSPA is lower. We can also see that the BP in the larger network (N-21) is lower than that in the smaller network (N-17) for all three schemes, because a larger network offers more routes and wavelengths to incoming requests.
Fig. 5(e) and (f) show that, in both test topologies, the SSPA achieves a shorter recovery time than the PSPA-DiR and the PSPA. The reason (see the analysis in subsection 2.3) is that the segment and its backup path are shorter than the full primary path and its backup path, so after a failure the recovery time of segment protection is shorter than that of path protection according to Eq. (5) and Eq. (6).
Fig. 5(a-d) also shows that, as $R_r$ increases, the resource utilization ratio becomes lower and the blocking probability becomes higher. When $R_r$ is larger, more primary paths need backup paths and more wavelengths are reserved, so the resource utilization ratio drops. A lower resource utilization ratio leads to a higher blocking probability because fewer free wavelengths can be assigned to subsequent requests.
In Fig. 5(e) and (f), the recovery time of SSPA becomes longer as $R_r$ increases. The reason is that a larger $R_r$ leads to a longer protected segment according to Step 4 of the SSPA process, and a longer protected segment leads to a longer backup path. The total length of the segment and its backup path therefore grows, which leads to a longer recovery time according to Eq. (5) and Eq. (6).
We can thus conclude that, with differentiated reliability requirements, the SSPA has a higher resource utilization ratio, a lower blocking probability, and a faster recovery time than the previous algorithm PSPA-DiR.

Conclusion
In this paper, we propose an algorithm, called the Segment Shared Protection Algorithm (SSPA), for differentiated reliable WDM mesh networks. The main idea of the SSPA is to provide a backup path for a segment of the primary path while guaranteeing the reliability required by the users. The simulation results show that, with differentiated reliability requirements, the SSPA has a higher resource utilization ratio, a lower blocking probability, and a faster recovery time than the previous algorithm PSPA-DiR.

Fig. 1. An illustration of path protection and segment protection.
Fig. 3. Process of recovery.
In Eq. (5), $n_{ps}$ and $n_b$ are the numbers of nodes traversed by the NIS and the WUS, respectively. The signal propagation time s is calculated as
$$s = \frac{d_{ps} + d_b}{u} \qquad (6)$$
where $d_{ps}$ and $d_b$ are the physical distances of the fiber links traveled by the NIS and the WUS, respectively, and u is the speed of light in the fiber ($u = 2 \times 10^8$ m/s).