Quality of Protection on WDM networks: A Recovery Probability based approach

WDM network survivability needs a flexible quality of protection (QoP) because of the variety of existing connection demands and the lack of fair resource distribution and optimal resource assignment under traditional QoP. This paper therefore proposes a new QoP paradigm based on generic levels of protection, where the set of protection levels can be defined as the network administrator needs; i.e., a flexible QoP approach of which the traditional, non-flexible QoP approach is a particular case. Essentially, the proposed generic level is based on the recovery probability concept, which measures the degree of conflict among primary lightpaths that share backup lightpaths for link failure recovery. To study how this strategy affects the network cost, a Genetic Algorithm is proposed. It calculates the primary and backup lightpaths using multi-objective optimization based on a lexicographical sorting approach. The Genetic Algorithm minimizes the number of blocked requests, the number of services without protection, the total difference between the requested QoP and the assigned QoP, and the network cost, which considers the optical fibers used, subject to the wavelength usage constraint. The experimental results indicate that the proposed flexible QoP approach is a promising strategy, and that the network cost, the number of requests and the QoP levels are conflicting objective functions in environments with homogeneous and heterogeneous QoP requirements.


Introduction
Optical networks based on WDM (Wavelength-Division Multiplexing) technology are the main means of high-performance communication [1,2,3,4,5,6,7,8,9]. WDM networks involve three fundamental aspects: design, administration, and survivability [1,10]. Within these aspects, a key point is the service type, which can be determined by the number of sources and destinations, the quality of service (QoS), the failure types, and the recovery schemes, among others [1,3,11,12,13]. Basic QoS classifications have been reported by several studies [2,4,8,9,14] without considering fair distribution and optimal allocation of resources. However, there is a need to use the QoS requirement to ensure communication recovery. This is called the quality of protection (QoP) [4]. Another critical aspect is the growing diversity of services with QoP requirements, which the traditional scheme of QoP classification cannot readily accommodate [2,4,8,9,14]. As a result, a new QoP paradigm must be developed to meet the diversity of QoP requirements in a simple way.
On this basis, the main contribution of this work is a flexible QoP strategy based on recovery probability for simple link failure scenarios. Firstly, this approach is compared with traditional QoP strategies (non-flexible schemes), using the routing and wavelength assignment with protection against simple link failures (RWA-PLF) problem as a case study. Through this simple experiment we show the benefit of the proposed flexible QoP scheme from the user's perspective. Secondly, to study the behavior of the routing cost when the parameters of the requests and the QoP levels change, a Genetic Algorithm inspired by [2] is proposed to solve the RWA-PLF problem, considering the flexible QoP strategy and several scenarios. Basically, we want to optimize the cost of the primary and backup lightpaths of a set of unicast requests, subject to their QoP requirements and a pre-defined set of QoP levels.
This document is organized as follows: Section 2 briefly presents failure management schemes. Section 3 introduces the concepts of QoS and QoP. The flexible QoP proposal is given in Section 4, while the proposed Genetic Algorithm is described in Section 5. Section 6 presents and discusses experimental results, and finally, Section 7 gives our conclusions and future work.

Failures Management Schemes
The literature reports various failure types: node failures, link failures, and shared-risk resource group failures [15]. This work focuses on link failures, considering a single point of failure, i.e., a simple failure. Generally, WDM networks are vulnerable to link failures with catastrophic consequences, since a large volume of traffic is lost in a very short time. These link failures happen due to fiber cuts and failures of hardware devices such as receivers, lasers, amplifiers, routers and converters. Reports indicate that one kilometer of cable is damaged on average once every 228 years (4.39 cuts/year/1000 kilometers) [12]. Therefore, WDM network survivability is a critical issue. Consequently, network survival strategies have been developed that define alternative paths in case of failures. Under these circumstances, various static and dynamic protection paradigms have been proposed [1,16]. The following classification is fundamental for this work:
1. Survival techniques indicate the moment at which the backup path is calculated.
(a) Protection-oriented survival [1,2]: primary and backup paths are calculated at design time. Two approaches can be identified according to the shared resource type:
i. Dedicated protection (1+1): the resources assigned to the backup path cannot be used by other demands.
ii. Shared protection (M:N): M backup paths protect N primary paths; i.e., the resources of backup paths can be used by other primary services.
(b) Restoration-oriented survival: the backup path is calculated after the primary path fails, without guarantee; i.e., best effort.
2. Survivability path section defines the sections of the primary path that are protected independently.
(a) Path-oriented protection: a backup path protects the complete primary path.
(b) Link-oriented protection: a backup path protects a link of the primary path. There will be as many backup paths as links in the primary path.
(c) Segment-oriented protection: a backup path protects a segment of the primary path; therefore, there will be as many backup paths as pre-defined segments in the primary path.
Note that link-oriented and path-oriented protection are particular cases of segment-oriented protection. Protection schemes change their structure depending on whether the traffic is unicast or multicast. A good classification can be found in [17]. Another aspect to consider is how to evaluate the performance of network survivability. In [18], efficient resource usage, failure recovery time, and structural complexity are used as network survivability measures.
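The cut-rate figure quoted earlier in this section can be checked with a short calculation. The sketch below converts the reported 4.39 cuts/year/1000 km into a mean time between cuts for one kilometer of cable; the function name and the assumption that cuts are spread uniformly along the cable are illustrative and not taken from [12].

```python
def mean_years_between_cuts(cuts_per_year_per_1000km, length_km=1.0):
    """Mean time in years between cuts for a cable of length_km.

    Assumes cuts are spread uniformly along the cable (an illustrative
    simplification, not a statement from the cited report).
    """
    cuts_per_year = cuts_per_year_per_1000km * length_km / 1000.0
    return 1.0 / cuts_per_year


# One kilometer of cable at 4.39 cuts/year/1000 km
print(round(mean_years_between_cuts(4.39)))  # 228
```

Under the same assumption, a 1000 km cable would see a cut roughly every 1/4.39 years, which is why survivability is treated as a design-time concern.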

Quality of Protection
For an Internet service provider, the cost depends on the physical layout and the network operation. There are different rates according to the QoS level. These rates are sensitive to data volume, time (day/night), priority, delay, data loss, and other factors called QoS parameters [1].
Huang et al. [7] present the routing and protection problem with QoS requirements as constraints. They divide the network QoS metrics into two categories: additive and bottleneck. For example, the path delay is the sum of the delays of all the links on the path, so it is an additive metric. Another example is the path loss rate, which is the product of the loss rates of all the links on the path. However, by using the logarithm of the loss rate, the loss rate can also be treated as an additive metric. On the other hand, the bandwidth is a bottleneck metric, since the bandwidth of a path is the minimum of the bandwidths of all links on the path.
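The two metric categories can be illustrated with a minimal sketch. The function names and the three-link path values below are hypothetical and are not taken from [7]; the second function only shows the logarithm trick that turns a product over links into a sum.

```python
import math


def additive_metric(link_values):
    """Additive metric: the path value is the sum of the link values (e.g. delay)."""
    return sum(link_values)


def multiplicative_as_additive(link_values):
    """Turn a product over links into a sum by adding the logarithms of the link values."""
    return sum(math.log(v) for v in link_values)


def bottleneck_metric(link_values):
    """Bottleneck metric: the path value is the minimum link value (e.g. bandwidth)."""
    return min(link_values)


# Hypothetical three-link path
delays_ms = [2.0, 3.5, 1.5]
per_link_factors = [0.5, 0.25, 0.8]
bandwidths = [10, 2.5, 40]

print(additive_metric(delays_ms))        # 7.0
log_sum = multiplicative_as_additive(per_link_factors)
print(math.exp(log_sum))                 # equals the product of the factors, up to rounding
print(bottleneck_metric(bandwidths))     # 2.5
```

Working with the sum of logarithms lets a shortest-path routine that expects additive link weights handle a multiplicative metric without modification.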
Various works [2,4,8,9,14] have reported basic QoP classifications. Gerstel et al. [8] suggest a Quality of Protection (QoP) paradigm where protected-dedicated and unprotected-dedicated primary paths are considered. The concept of QoP is proposed for the first time in [8] as an extension of QoS, but the authors do not define how to quantify the QoP.
Zhong et al. [14] present a basic QoP proposal based on linear programming to establish paths in a WDM network dynamically. It defines QoP levels based on the exclusivity of assigned paths:
• Level 2: Link-disjoint primary and backup paths. Both are dedicated and pre-calculated (protection-oriented survival).
• Level 1: Link-disjoint primary and backup paths. Only the primary path is dedicated and pre-calculated; if the primary path fails, a backup path is calculated (restoration-oriented survival).
• Level 0: Only non-dedicated primary paths without protection are provided; i.e., the primary path can be preempted by other paths.
At this point, it is important to highlight the conclusion stated in [14]: allowing multiple levels of service leads to higher performance in terms of resource use and reduces the number of blocked connections.
Similarly, Kim et al. [19] indicate that solutions with 5 QoP levels allow cost savings of up to 30% compared with solutions without QoP levels, and of up to 8% when using just 2 levels instead of 5. Two key aspects are observed in that work: (a) the importance of multiple QoP levels; and (b) the determination of the optimal number of QoP levels.
• Reroutable Class (Class C): The shortest path is assigned as primary without protection; however, when a failure happens, a backup path is calculated if possible.
• Unprotected Class (Class D): The shortest path is assigned as primary without protection or rerouting.
• Preemptable Class (Class E): Resources of the primary path can be preempted by other paths.
Nuñez et al. [2] proposed a 3-level QoP scheme (gold, silver, and bronze), according to the exclusivity of protection resources:
• Gold level (1+1): Link-disjoint primary and backup paths; both are dedicated and pre-calculated.
• Silver level (1:1): Link-disjoint primary and backup paths. The primary path is dedicated and pre-calculated, and the backup path is shared and pre-calculated. If the primary path fails, the backup is used.
• Bronze level: Only a pre-calculated and non-dedicated primary path is provided.
Pujo et al. [9] propose a QoP based on delay differentiation, which takes many parameters into account. Assigning higher priority to certain traffic degrades the capabilities of lower-priority traffic.
On the basis of the works [8,14,19,2,9], this work extends the concept of QoP from fixed levels to flexible QoP for WDM network protection, considering dedicated and shared protection schemes. In this context, each of the reviewed works is a particular case of the flexible QoP level strategy presented in the next section.

Proposal
This work is developed in the field of WDM networks where the optical channels can contain multiple optical fibers and wavelengths. Furthermore, we suppose that each network node has full wavelength conversion capacity. This proposal focuses on the RWA-PLF problem, where routing algorithms, wavelength assignment strategies and protection schemes are key. In this context, the following features are considered: (1) unicast requests; (2) static traffic; (3) wavelength usage constraint; (4) full wavelength conversion capacity on all nodes; (5) simple-link failures; (6) segment-oriented protection scheme; and (7) protection-oriented survival. First, to understand the flexible QoP concept, we need to define the symbols used.

Symbolic Representation
For this work, a WDM network is modeled as a graph $G = (V, E, \Lambda)$, where $V$ is the set of nodes, $E$ is the set of links, and $\Lambda$ is the set of available wavelengths on each optical link belonging to $E$. Let:
$e = (i, j)$: optical link from node $i$ to node $j$, where $i, j \in V$ and $e \in E$;
$\Lambda_e$: set of wavelengths on link $e$;
$c_{e\lambda}$: optical channel on link $e$ and wavelength $\lambda \in \Lambda_e$, $c_{e\lambda} = (i, j, \lambda)$;
$Q$: set of QoP levels provided by the network, $Q = \{q_0, q_1, \ldots, q_{|Q|-1}\}$;

QoP flexible in WDM networks
The proposed strategy determines generic QoP levels by means of the recovery measure $P_{s_u}$ of the primary lightpaths. This strategy only considers protection-oriented survival. Generically, the set of levels is as follows:
1. Priority service: exclusive primary and backup lightpaths (1+1). Maximum recovery measure $P_{s_u} = q_0 = 100\%$ (dedicated protection).
2. Protected service with M levels: exclusive primary lightpath and shared backup lightpath with recovery measure $P_{s_u} \geq q_i = M\%$, where $i \in \{1, 2, \ldots, |Q| - 2\}$. Each link of the backup lightpath can be shared with other backup lightpaths. In this context, the shared setting 1:N is a special case where N backups use the same lightpaths (shared protection).
3. Preemptible service: primary lightpaths whose components can be used by other backup lightpaths; the recovery measure for this service is $P_{s_u} = q_{|Q|-1} = 0\%$.
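The three kinds of levels above can be read as a threshold lookup. The following is a minimal sketch, not the authors' procedure: the function name, the `dedicated` flag, and the threshold list are assumptions chosen for illustration, with level 0 reserved for dedicated (1+1) protection and the last level for preemptible service.

```python
def assigned_level(p_su, levels, dedicated=False):
    """Return the index of the strictest QoP level met by a service.

    p_su      -- recovery measure of the service, in [0, 1]
    levels    -- thresholds q_0 >= q_1 >= ... >= q_{|Q|-1}, as fractions
    dedicated -- True when the backup lightpath is exclusive (1+1)
    """
    if dedicated:
        return 0  # priority service
    # Shared protection: first level i >= 1 whose threshold is met.
    for i in range(1, len(levels)):
        if p_su >= levels[i]:
            return i
    return len(levels) - 1  # fall back to the lowest level


# Hypothetical level set: 100% (1+1), 75%, 50%, 25%, 0% (preemptible)
Q = [1.0, 0.75, 0.5, 0.25, 0.0]
print(assigned_level(1.0, Q, dedicated=True))  # 0
print(assigned_level(0.8, Q))                  # 1
print(assigned_level(0.3, Q))                  # 3
```

Because the administrator only supplies the list of thresholds, adding or removing a level changes the data, not the lookup.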
Table 1 compares the state-of-the-art approaches with our proposal. The comparison is somewhat complex because protection levels and restoration are handled at the same time. The highest priority service has dedicated protection, as shown in row 1 (Table 1).
Different shared protection levels are listed in rows 2, 3, and 5. Row 4 (for Proposals 1 and 3) shows services without protection but with restoration, while row 6 presents the lowest priority level. The maximum QoP level (based on the 1+1 scheme) can be observed at the intersection of the flexible QoP column and row 1. Finally, rows 2 to 6 present QoP levels ranging from 0% (unprotected and preemptible) to 100% (1:1) of guaranteed recovery. This work focuses on the network survival design phase; therefore, only protection schemes are considered. Note that there may be as many QoP levels as needed, in contrast to the fixed classes in Proposals 1, 2, and 3 [2,14,19], and that each QoP class defined in [2,14,19] is a particular case of this paper's proposal.
Another important aspect is that the works [2,14] and [19] only consider path-oriented protection schemes, while this work applies a segment-oriented protection scheme, which is more efficient than the other schemes.
Finally, with this flexible scheme of QoP levels, the following essential points are achieved:
1. Flexibility is given to network owners to create as many service levels as they consider appropriate, rather than only the fixed schemes of Proposals 1, 2, and 3 (Table 1).

Link Recovery Probability
Before evaluating the recoverability of a service $s_u$, the recovery probability $P_r^{s_u,\bar{e}}$ of a link $e \in p_u$ must be evaluated. This probability is calculated with respect to the number of services that share protection resources; therefore, the following definition is given:

Definition 1
$\eta_e^{s_u}$ is the number of services $s_v \in S - \{s_u\}$ that share protection resources when a link $e$ is used simultaneously by primary lightpaths $p_u$ and $p_v$ with $v \in U - \{u\}$.

The range of values of $\eta_e^{s_u}$ is between 0 and $|S| - 1$.
Then, $\eta_e^{s_u}$ can be calculated as follows:
$$\eta_e^{s_u} = \left|\left\{ s_v \in S - \{s_u\} \;:\; e \in p_u,\ e \in p_v,\ c_{e\lambda} \in b_u \text{ and } c_{e\lambda} \in b_v \text{ for some } \lambda \right\}\right|$$
In this context, the link recovery probability $P_r^{s_u,\bar{e}}$ of a primary path $p_u$, in case the link $e$ fails, can be calculated as:
$$P_r^{s_u,\bar{e}} = \frac{1}{\eta_e^{s_u} + 1} \qquad (3)$$
Note that if a service $s_u$ does not conflict with other services when the link $e$ fails, then $\eta_e^{s_u} = 0$ and the recovery probability is $P_r^{s_u,\bar{e}} = 1$ (100%). Conversely, with more conflicts over the resources, the probability tends to 0, given that $\eta_e^{s_u}$ increases up to $|S| - 1$. Furthermore, in (3) we suppose that all competitors are equally likely to take the protection resource; i.e., there is no priority.
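The counting in Definition 1 can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than the authors' implementation: a service is represented as a pair (set of primary links, set of backup channels), two services are taken to conflict on link e when both primaries use e and their backups share at least one channel, and the recovery probability is taken as 1/(η + 1), which follows from the equal-likelihood assumption in the text. The shared link (12, 10) and shared channel (5, 3, 1) come from the Test 1 discussion; all other links and channels are hypothetical.

```python
def conflicts_on_link(u, e, services):
    """Number of other services competing with u for backup resources if link e fails."""
    primary_u, backup_u = services[u]
    if e not in primary_u:
        return 0
    count = 0
    for v, (primary_v, backup_v) in services.items():
        if v == u:
            continue
        # Conflict: both primaries use e and the backups share a channel.
        if e in primary_v and backup_u & backup_v:
            count += 1
    return count


def link_recovery_probability(u, e, services):
    """Assumed form 1 / (eta + 1): every competitor is equally likely to win."""
    return 1.0 / (conflicts_on_link(u, e, services) + 1)


# service id -> (primary links, backup channels); mostly hypothetical values
services = {
    "u3": ({(12, 10), (10, 9)}, {(5, 3, 1), (3, 2, 1)}),
    "u4": ({(12, 10), (10, 11)}, {(5, 3, 1)}),
}

print(conflicts_on_link("u3", (12, 10), services))          # 1
print(link_recovery_probability("u3", (12, 10), services))  # 0.5
print(link_recovery_probability("u3", (10, 9), services))   # 1.0
```

Only the links where both conditions hold reduce the probability; a shared backup channel alone is harmless as long as the primaries never fail together.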

Service recovery measurement
The recovery measure of a service can be computed with expressions (2) and (3). We now proceed to calculate the recovery measure $P_{s_u}$.

Definition 2
The recovery measure of a service, $P_{s_u}$, is the average of the link recovery probabilities $P_r^{s_u,\bar{e}}$.
Formally, $P_{s_u}$ is set to:
$$P_{s_u} = \frac{1}{|p_u|} \sum_{e \in p_u} P_r^{s_u,\bar{e}} \qquad (4)$$
where $|p_u|$ is the number of optical links in the path $p_u$. Expression (4) is the measure used in this work to define the QoP levels of a service when a link fails. At this stage, it is important to note that criterion (4) is only one valid measure; the maximum or minimum link recovery probability, or other criteria, could be used instead.
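Definition 2 can be sketched as an average over the links of the primary path. The sketch below assumes the per-link probability 1/(η + 1) implied by the equal-likelihood assumption, and takes the per-link conflict counts as input; the six-link example is hypothetical and is not a value from Table 3.

```python
def link_probabilities(conflicts_per_link):
    """Per-link recovery probabilities, assuming the form 1 / (eta + 1)."""
    return [1.0 / (eta + 1) for eta in conflicts_per_link]


def service_recovery_measure(conflicts_per_link):
    """Average of the link recovery probabilities over the primary path."""
    probs = link_probabilities(conflicts_per_link)
    return sum(probs) / len(probs)


# Hypothetical primary path with six links; only the first link has one competitor.
eta = [1, 0, 0, 0, 0, 0]
print(service_recovery_measure(eta))  # (0.5 + 5 * 1.0) / 6, about 0.917

# Alternative criteria mentioned in the text
print(min(link_probabilities(eta)))   # 0.5
print(max(link_probabilities(eta)))   # 1.0
```

The average rewards long paths with few contested links, whereas the minimum reports the worst single link; which one suits a level definition is a design choice left to the administrator.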

Mathematical Formulation
Using the above definitions, the RWA-PLF problem with QoP requirements can be treated in a multi-objective optimization context based on a lexicographic approach. Given a network topology $G$, a set of requests $U$, a vector of link costs $C$, and a set of available services $Q$, this problem aims to calculate the solution $S$ that minimizes the number of blocked requests $f_1$, the number of services without protection $f_2$, the difference in levels between the requested and obtained QoP $f_3$, and the routing cost $f_4$. The last objective function ($f_4$) is the network cost considering the optical fibers used in primary and backup lightpaths. The problem is stated as follows: such that: where: subject to the wavelength usage constraint: Note that constraint (10) implies that $\sigma_{e\lambda}^{p_u} + \sigma_{e\lambda}^{b_u} \leq 1$ and defines whether a link $e$ is used by the primary or the backup of a request $u$. Furthermore, constraint (10) indicates that each $c_{e\lambda}$ is used in just one lightpath. This is important because conflicts arise among services with shared lightpaths that cannot use an optical channel simultaneously.
Apply binary tournament to $\Omega_g$, obtaining $\Omega_{selected_g}$;

Optimizer Algorithm
An elitist Genetic Algorithm [20] named Flexible Levels Service Algorithm (FLSA) is proposed as a design tool. FLSA is given in Algorithm 1, while the evaluation of a solution is outlined in Algorithm 2.
It is important to note that the four objective functions are evaluated in Algorithm 2. The best solution, calculated in generation $g$, is updated in line 7 of Algorithm 1. This update follows the lexicographical approach [21] according to the procedure given by Algorithm 3.
Note that the objective functions $f_1$, $f_2$, and $f_3$ are usually treated as constraints [22], but in this implementation they are treated as objective functions together with the routing cost $f_4$ to give flexibility to the FLSA search.
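The lexicographical update can be illustrated with a minimal sketch. This is not Algorithm 3 itself; it only assumes that a solution is summarized by the vector (f1, f2, f3, f4), that all four objectives are minimized, and that earlier objectives take strict priority over later ones. Python compares tuples lexicographically, so the comparison reduces to a tuple comparison. The objective values below are hypothetical.

```python
def is_better(candidate, incumbent):
    """True if candidate precedes incumbent in lexicographic order of (f1, f2, f3, f4)."""
    return tuple(candidate) < tuple(incumbent)


def update_best(best, candidate):
    """Keep the lexicographically smaller objective vector (elitist update)."""
    return candidate if best is None or is_better(candidate, best) else best


# Hypothetical objective vectors: (blocked, unprotected, level difference, cost)
a = (0, 1, 2, 100.0)
b = (0, 1, 3, 50.0)
c = (1, 0, 0, 10.0)

print(is_better(a, b))  # True: equal f1 and f2, smaller f3 wins despite higher cost
print(is_better(c, a))  # False: one blocked request outweighs every later objective

best = None
for objectives in (b, c, a):
    best = update_best(best, objectives)
print(best)             # (0, 1, 2, 100.0)
```

The ordering means that cost is only ever used to break ties among solutions that block, protect, and match QoP levels equally well.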
Chromosome. The proposed chromosome can be seen in Figure 1. $S_u$ represents the chromosome and $s_{u_i}$ is the $i$-th gene. Each gene encodes the solution for one unicast request $u_i$. Figure 2a presents a sample solution in the NSF network with a single request $u = (1, 14, \text{Priority}(1+1))$.
Initial Population. The initial population $\Omega_0$ is randomly generated to obtain diversity at the beginning of the process. Each $p_u$ and $b_u$ is built by selecting links and wavelengths randomly. In addition, the initial population is seeded with two extreme solutions to capture local information: (a) one solution is made up of the first and second shortest paths (path-oriented protection); and (b) in the other solution, the primary path is the shortest path while the backup path is made up of the shortest paths that protect each link of the primary path (link-oriented protection).
Selection. The selection operator in this work is the binary tournament [21]. Given two individuals selected at random, the better of the two is taken as a parent. The comparison is performed lexicographically over the objective functions in Algorithm 2.
Crossover. The crossover operator is performed on two parent chromosomes returned by the selection operator. An example of crossover can be seen in Figures 3a and 3b. Figure 3b shows the steps of the crossing: in Step 1, the common nodes of both parents are selected; Step 2 applies the shortest path computation to each pair of nodes; Step 3 calculates the shortest path from the source to the destination node as the backup path. A null gene indicates that no path was found.
Mutation. This operator is applied with probability $p_m$; when it occurs, an individual is replaced by a new, randomly generated individual.
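The selection and mutation operators described above can be sketched as follows. This is a minimal sketch, not the FLSA code: the function names, the `objectives` callback (assumed to return the vector (f1, f2, f3, f4) of an individual) and the `random_individual` factory are hypothetical, and crossover is omitted because it depends on the network topology.

```python
import random


def binary_tournament(population, objectives, rng):
    """Pick two distinct individuals at random and return the lexicographically better one."""
    a, b = rng.sample(population, 2)
    return a if objectives(a) <= objectives(b) else b


def mutate(individual, p_m, random_individual, rng):
    """With probability p_m, replace the individual by a freshly generated one."""
    if rng.random() < p_m:
        return random_individual(rng)
    return individual


# Toy usage: individuals are labels with precomputed hypothetical objective vectors.
scores = {"A": (0, 1, 2, 100.0), "B": (0, 1, 3, 50.0), "C": (1, 0, 0, 10.0)}
rng = random.Random(42)
parent = binary_tournament(list(scores), scores.get, rng)
child = mutate(parent, 0.1, lambda r: r.choice(list(scores)), rng)
print(parent, child)
```

Passing the random generator explicitly keeps runs reproducible, which helps when averaging costs over repeated executions.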

Experimental Results
The proposed algorithm FLSA was implemented on a PC with an Intel i7-3630QM 2.40 GHz processor (8 threads) and 8 GB of RAM, using the Javac Compiler 1.7.0. Several sets of unicast demands $U$ were selected for the NSF network. Each optical fiber has the same number of wavelengths. Basically, three types of experiments are conducted using the FLSA algorithm with the evolutionary parameters given in Table 2.
The objective of Test 1 is to show the unfairness that arises when protection is based on shared resources. To this end, the recovery measure $P_{s_u}$ is calculated for all solutions, considering the requests of Table 3.
Algorithm FLSA was run once and the best solution was recorded. The primary and backup lightpaths of this solution are presented in Figures 4a, 4b, 4c, and 4d, while their recovery measures $P_{s_u}$ are listed in Table 3.
The results show how different services with the same QoP requirement ($q = 0\%$) obtain different recovery measure values (Table 3). From a practical viewpoint, all customers pay the same cost but receive different service qualities. Conflicts over protection resources can be seen with the requests $u_2$, $u_3$, and $u_4$, corresponding to the services $s_{u_2} = (p_{u_2}, b_{u_2})$, $s_{u_3} = (p_{u_3}, b_{u_3})$, and $s_{u_4} = (p_{u_4}, b_{u_4})$, since they have $P_{s_u} < 100\%$.
To understand the calculation of $P_{s_u}$, consider the service $s_{u_3}$ in Figure 4c, which competes with the service $s_{u_4}$ sketched in Figure 4d. Firstly, the primary lightpaths $p_{u_3}$ and $p_{u_4}$ both use the optical link $e = (12, 10)$; secondly, the backup optical channel $(5, 3, 1)$ is assigned to both lightpaths $b_{u_3}$ and $b_{u_4}$. In this context, the calculation of the recovery measure of the service $s_{u_3}$ is as follows: |{(12,10), (10,9), (9,13), (13,14), (14,8), (8,3)
Applying the same procedure, $P_{s_{u_2}}$ and $P_{s_{u_4}}$ can also be calculated.

Test 2: RWA-PLF with homogeneous QoP requirements
Basically, the cost function is studied considering $P_{s_u} \geq q_k$ ($q_k \in Q$, $\forall u$). Four sets of unicast requests $U_n$ are generated, with $n$ = 10, 20, 30 and 40 requests, respectively. All requests have the same QoP level. At the same time, a set $Q$ of six requirement levels is considered, defined as follows:
• priority service (1+1), corresponding to $q_0 = 100\%$;
• protected services with the following values: level M2 $q_2 = 75\%$, level M3 $q_3 = 50\%$, level M4 $q_4 = 25\%$;
• preemptible service $q_5 = 0\%$.
In this way, each of these six levels is used to evaluate the sets of traffic requests $U_{10}$, $U_{20}$, $U_{30}$, and $U_{40}$.
FLSA is run ten times for each test instance to obtain the average costs as a preliminary study. The outcome of these tests shows that when the number of requests increases, a higher recovery probability becomes necessary and, therefore, a greater amount of resources.
Figure 5 presents the results, where each label indicates the number of requests and the requested level separated by a dash. It can be seen that as QoP levels decrease, average costs also decrease. That is, the cost and the QoP are conflicting objective functions. In addition, the correlations between the levels and the costs for the sets $U_{10}$, $U_{20}$, $U_{30}$, and $U_{40}$ are 0.88, 0.84, 0.91, and 0.91, respectively. These correlations indicate a strong relationship between the requested QoP and the average cost.

Test 3: RWA-PLF with heterogeneous QoP requirements
The set of requests was generated randomly according to the following criteria:
• Number of Sessions (n): the number of connection requests for each network node.
• Minimum Hops (h): the minimum distance between source and destination nodes, where the distance between adjacent nodes is fixed to one.
For each source node, a destination node is randomly selected according to the previous criteria. For this experiment, the following values define the test cases:
• Number of Sessions (n): 1, 2, 3, 4, 5.
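The request generation described above can be sketched as follows. This is a minimal sketch under stated assumptions: the topology is given as an adjacency dictionary, hop distances are computed by breadth-first search, and each node issues n requests toward destinations at least h hops away (h is assumed to be at least 1). The six-node ring is a hypothetical topology, not the NSF network, and the assignment of a QoP level to each request is left out.

```python
import random
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def generate_requests(adj, n, h, rng):
    """For each node, draw n destinations located at least h hops away."""
    requests = []
    for src in sorted(adj):
        dist = hop_distances(adj, src)
        candidates = sorted(d for d, hops in dist.items() if hops >= h)
        for _ in range(n):
            if candidates:  # skip nodes with no destination far enough
                requests.append((src, rng.choice(candidates)))
    return requests


# Hypothetical 6-node ring topology
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
requests = generate_requests(ring, n=2, h=2, rng=random.Random(7))
print(len(requests))  # 12
```

Raising h removes nearby destinations from the candidate list, so the generated lightpaths are longer on average.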
The results show that increasing the parameters h and n tends to increase the cost, whereas increasing σ produces no significant change. The results are shown in Figures 6a, 6b, and 6c. For example, U1-2-10 represents all tests where the requests satisfy Number of Sessions (n) = 1, Minimum Hops (h) = 2, and Standard Deviation (σ) = 10.

Conclusions and Future Work
In this paper we have proposed a paradigm for quantifying the grade of protection service in a generic way. Protection resources can be dedicated or shared. The first scheme provides 100% recovery, which is guaranteed by dedicated protection. The second scheme is associated with a measure based on recovery probability, which indicates the degree of conflict among users with shared resources. Basically, this work proposes using the recovery probability concept to define different degrees of conflict, which make up a set of QoP levels.
The FLSA algorithm, based on a Genetic Algorithm, was developed to solve the RWA-PLF problem with the flexible QoP approach. FLSA calculates primary and backup lightpaths, considering the QoP requirement of each unicast request and the set of pre-defined QoP levels.
These preliminary experimental results have shown that the proposal provides flexibility in defining the QoP levels before the optimization process. Thus, the network administrator can determine the QoP levels the business needs, with the advantage of having a list of available and adjustable QoP levels. This ultimately allows a fair administration of the assigned resources from the customer's viewpoint; i.e., customers pay for a service according to the QoP level nearest to the one requested.
Future work will consider the analysis of other QoS variables to create new and more elaborate QoP metrics. It is also necessary to run experiments in a pure multi-objective optimization context [22] to study the impact among conflicting objective functions. Finally, it is important to consider other network topologies, multicast traffic and more rigorous statistical tests [23].
$\eta_e^{s_u}$ is between 0 and $|S| - 1$, where $\eta_e^{s_u} > 0$ if $e \in p_u$ and $e \in p_v$ and $c_{e\lambda} \in b_u$ and $c_{e\lambda} \in b_v$ for some $\lambda$, and $\eta_e^{s_u} = 0$ otherwise.

Figure 1: Structure of the chromosome.

Figure 2: Example of relationship between chromosome and solution. (a) Primary and backup paths on NSF network. (b) Primary and backup paths coded by chromosome.

Figure 3: (a) Individuals to be crossed. (b) Crossing process.

Figure 4: Solution $S_u$
where $|\cdot|$ indicates the cardinality of a set;
$U$: set of unicast requests, $U = \{u_1, u_2, \ldots, u_{|U|}\}$;
$u = (f, d, k)$: unicast request $u \in U$ with source node $f$, destination node $d$, and $k$-th QoP requirement ($q_k \in Q$);
$l_u$: lightpath composed of optical channels, $l_u = \{c_{e_1\lambda_1}, c_{e_2\lambda_2}, \ldots, c_{e_n\lambda_n}\}$, with $e_1 = (f, i_1)$ and $e_n = (i_n, d)$;
$p_u$: primary lightpath $l_u$ corresponding to the unicast request $u$;
$b_u$: backup lightpath $l_u$ corresponding to the unicast request $u$;
$s_u = (p_u, b_u, v)$: service in response to request $u$, with primary and backup lightpaths and $v$-th QoP level assigned to $u$ ($q_v \in Q$);
$\lambda_e^u$: wavelength $\lambda$ assigned to unicast request $u$ on the link $e$;
$S$: solution in response to all unicast requests, $S = \{s_{u_1}, s_{u_2}, s_{u_3}, \ldots, s_{u_{|U|}}\}$;
$P_r^{s_u,\bar{e}}$: recovery probability of service $s_u$ when link $e \in p_u$ fails;
$P_{s_u}$: average recovery probability of $s_u$.

Table 1: QoP schemes comparison.

Algorithm 1 FLSA
INPUT: set of requests $U$, set of available services $Q$, network topology $G$, genetic parameters, and vector of link costs $C$.
OUTPUT: solution $S$.
1: Initialize initial population $\Omega_{g=0}$; // generation g
2: while stop condition is not met do

Table 3: Recovery measure $P_{s_u}$ obtained in Test 1.