The MPLS WG Archive

LSP Failure/Recovery

  • From: "David Allan" <dallan@nortelnetworks.com>
  • Date: Tue, 9 Oct 2001 16:40:01 -0400
  • X-Orig: <dallan@americasm01.nt.com>

Title: RE: LSP Failure/Recovery

Venkata/Mani:

-----Original Message-----
From: Naidu, Venkata [mailto:Venkata.Naidu@Marconi.com]
Sent: Tuesday, October 09, 2001 1:01 PM
To: 'manis@futsoft.com'; 'Sachin Kalra'; mpls@uu.net




-> If you have a "local repair" support implemented, Router 16 will
-> reroute the LSP from itself (Router16) to the destination. It is not
-> required for the rerouting to be done from the Ingress or the head
-> of the LSP - which is known as "Global repair".

>  Strictly speaking, there is no requirement that only the
>  affected router (Router 16 here) should do "local repair".
>  Any router/LSR which has sufficient information can do
>  local repair (for example router 15 or 14 etc)

Technically, the definition of local repair in the framework document is THE router upstream of the failure taking action. Pragmatically, if the repairing router is not adjacent to the failure, then you need fault propagation and more complex diversity algorithms, and you endure longer outages on failure (due to propagation delay).
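
As a rough illustration of the outage point (a sketch only, not something taken from the framework document): assume the outage is the sum of failure detection time, a per-hop fault notification delay for each hop between the adjacent router and the router that actually repairs, and the switchover time. The function name and all the millisecond values below are hypothetical placeholders, not measurements.

```python
def outage_ms(hops_upstream, detect_ms, per_hop_notify_ms, switchover_ms):
    """Rough outage estimate for a repair done hops_upstream hops away.

    hops_upstream is 0 when the repairing router is adjacent to the
    failure, so no fault propagation is needed in that case.
    """
    return detect_ms + hops_upstream * per_hop_notify_ms + switchover_ms


# Illustrative values only (assumptions, not measured figures).
adjacent = outage_ms(0, detect_ms=10, per_hop_notify_ms=5, switchover_ms=2)
two_up = outage_ms(2, detect_ms=10, per_hop_notify_ms=5, switchover_ms=2)
print(adjacent, two_up)
```

Under this simple model, every extra hop between the failure and the repairing router adds one notification delay to the outage.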

 
-> Secondly Router 16 can reroute avoiding Router 17, only when
-> Router 17 is not an explicit Hop to be reached from Router 16 for
-> the rerouted LSP

>  Not necessarily true. If one of the links between Router 16
>  and Router 17 fails, the _only_ requirement is to avoid the
>  failed "link" and not the complete "node" (Router 17 here).

>  So Router 16 can locally route the LSP to another parallel
>  link to Router 17 _OR_ another disjoint path to 17. What you
>  are trying to achieve here is link disjointness and not
>  node disjointness.

I'm hearing about 3 things here:

- a backup route that is diverse from the failed resource may not exist.
- the requirement is for the backup path to avoid the failure (and you may not be able to distinguish link failures from node failures, in which case pre-planned recovery paths need to go to the next-next-hop).
- a scenario where a failure can authoritatively be localized to one link in a multi-link adjacency.

So my failure detection tells me what it can, and I do recovery accordingly (including declaring a real "hard" failure when an alternative does not exist).
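
The three cases can be sketched roughly as follows. This is a minimal illustration, not how any particular LSR implements it: it assumes the repairing router holds a topology as a list of (link id, node, node) tuples, uses a plain breadth-first search in place of a real constraint-based path computation, and ignores bandwidth and explicit-route constraints. The function name, the link ids, and Router 19 are made up for the example.

```python
from collections import deque


def backup_path(links, src, dst, failed_links=frozenset(), failed_nodes=frozenset()):
    """Return a list of link ids from src to dst that avoids the failed
    links and nodes, or None when no such path exists ("hard" failure)."""
    adj = {}
    for link_id, u, v in links:
        # Drop every link that is failed or touches a failed node.
        if link_id in failed_links or u in failed_nodes or v in failed_nodes:
            continue
        adj.setdefault(u, []).append((link_id, v))
        adj.setdefault(v, []).append((link_id, u))

    came_from = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk back to src, collecting the links used.
            path = []
            while came_from[node] is not None:
                link_id, prev = came_from[node]
                path.append(link_id)
                node = prev
            return path[::-1]
        for link_id, nxt in adj.get(node, []):
            if nxt not in came_from:
                came_from[nxt] = (link_id, node)
                queue.append(nxt)
    return None


# Hypothetical topology: two parallel links between 16 and 17,
# plus a detour around 17 via a made-up Router 19.
links = [
    ("15-16", 15, 16),
    ("16-17/a", 16, 17),
    ("16-17/b", 16, 17),
    ("17-18", 17, 18),
    ("16-19", 16, 19),
    ("19-18", 19, 18),
]

# Failure localized to one link of the multi-link adjacency: stay on 17.
link_repair = backup_path(links, 16, 17, failed_links={"16-17/a"})

# Link and node failure indistinguishable: go to the next-next-hop (18).
node_repair = backup_path(links, 16, 18, failed_nodes={17})

# No detour available: no diverse backup route exists.
no_detour = [l for l in links if 19 not in (l[1], l[2])]
hard_failure = backup_path(no_detour, 16, 18, failed_nodes={17})

print(link_repair, node_repair, hard_failure)
```

With this topology the first call stays on the parallel link to Router 17, the second goes around Router 17 via Router 19, and the third returns None, which corresponds to declaring a hard failure.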

  
-> W.r.to your questions, If Router 16, has indicated the ingress
-> and done the reroute (global repair), It will not indicate the
                                           ^^^^^
-> ingress to reroute again when the link between 16 and 17 comes
-> up.

>  Router 16 MAY indicate the ingress LER when the link between
>  Router 16 and 17 comes back up. There are no hard and fast rules...

The rules depend on whether or not the network recovers and re-uses the LSP resources that survive a failure.

Dave