The MPLS WG Archive

[mpls] Re: Reg: Fast Reroute Timings in RFC 4090

  • From: richard.spencer@bt.com
  • Date: Thu, 11 Aug 2005 08:10:28 +0100
  • Cc: mpls@ietf.org

Pranjal,

Total node failures (i.e. where a node dies completely) will be detected by all directly connected peers, as the links will go down at the physical layer. If you want to detect IP/MPLS layer failures, the IETF's protocol for doing this is BFD (see: http://www.ietf.org/html.charters/bfd-charter.html). You can run BFD on a per-LSP basis (in the data plane) and also at the IP forwarding level (essentially in the control plane), and the detection time is user configurable.
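
To illustrate how the configurable detection time comes about: in BFD's asynchronous mode, the detection time at one end is the peer's detect multiplier times the larger of the local required minimum receive interval and the peer's desired minimum transmit interval. The sketch below only shows that arithmetic. The function and parameter names are my own, and the example values are assumptions rather than recommendations.

```python
def bfd_detection_time_ms(remote_detect_mult,
                          local_required_min_rx_ms,
                          remote_desired_min_tx_ms):
    """Detection time seen by the local system (asynchronous mode).

    All names here are illustrative, not taken from any implementation.
    """
    # The peer transmits at the slower of what it wants to send
    # and what we are willing to receive.
    agreed_interval_ms = max(local_required_min_rx_ms,
                             remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval_ms


# Assumed example values: multiplier 3, 300 ms in both directions.
print(bfd_detection_time_ms(3, 300, 300))
# If the peer wants to send slower than we can receive, its value wins.
print(bfd_detection_time_ms(3, 50, 100))
```

With a multiplier of 3 and 300 ms intervals the detection time comes out at 900 ms, which sits inside the 500ms-1000ms range discussed in this thread.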

However, I think it's unlikely that you will be able to achieve reliable failure detection much faster than 500ms-1000ms. If the nodes supported it, you could adjust the timers to detect failures more quickly than this, but the network would be highly susceptible to false failures due to things like network congestion and short error bursts. Also, running multiple BFD sessions with very fast detection rates will consume a significant amount of processing resources.
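
A rough way to see both effects is sketched below. It uses a deliberately simple model that is an assumption on my part: every control packet sent during a congestion or error burst is lost, both directions use the same transmit interval, and the detection time is simply interval times multiplier. The function names, session count and burst length are hypothetical.

```python
def detection_time_ms(tx_interval_ms, detect_mult):
    # Simplified: same interval in both directions.
    return tx_interval_ms * detect_mult


def burst_forces_false_failure(burst_ms, tx_interval_ms, detect_mult):
    # Sufficient condition only: if the burst lasts at least one full
    # detection time and drops every control packet, the session
    # must time out even though the node is healthy.
    return burst_ms >= detection_time_ms(tx_interval_ms, detect_mult)


def control_packets_per_second(sessions, tx_interval_ms):
    # Control packets one node transmits per second across all sessions.
    return sessions * 1000 / tx_interval_ms


# Assumed scenario: 1000 sessions, multiplier 3, a 100 ms loss burst.
for interval_ms in (10, 50, 300):
    print(interval_ms,
          detection_time_ms(interval_ms, 3),
          burst_forces_false_failure(100, interval_ms, 3),
          control_packets_per_second(1000, interval_ms))
```

Under these assumptions, a 10 ms interval detects in 30 ms but is tripped by the 100 ms burst and costs two orders of magnitude more control packets per second than a multi-hundred-millisecond interval, which rides out the same burst.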

In general, when considering failure protection there is a tradeoff between network reliability and fast switchover. IMO, optimising a large SP network to protect against specific failures (i.e. MPLS node forwarding plane failures) faster than 500ms-1000ms (at the expense of processing resources and network reliability) is unlikely to be technically or commercially viable.

You mentioned packet loss for voice/video traffic in your original email. What impact do you think a 500ms-1000ms loss will have at the application layer? Can you give some examples?
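
As a starting point for that question, the outage can at least be translated into a packet count. The sketch below assumes a voice stream with a 20 ms packetisation interval, which is a common choice but purely an assumption here, and the function name is hypothetical. It says nothing about how a codec or jitter buffer would conceal the loss.

```python
def voice_packets_lost(outage_ms, packetization_ms=20):
    # Number of whole packets that fall inside the outage,
    # assuming one packet every packetization_ms.
    return outage_ms // packetization_ms


for outage_ms in (500, 1000):
    print(outage_ms, voice_packets_lost(outage_ms))
```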

Regards,
Richard

> -----Original Message-----
> From: mpls-bounces@lists.ietf.org [mailto:mpls-bounces@lists.ietf.org]
> On Behalf Of Dutta, Pranjal
> Sent: 11 August 2005 05:39
> To: David Charlap
> Cc: mpls@ietf.org
> Subject: [mpls] Re: Reg: Fast Reroute Timings in RFC 4090
> 
> 
> 
> Hi,
> I think most vendors implement a minimum hello interval of 1 second,
> as anything shorter might lead to scalability issues (I am not sure
> whether that is the reason vendors implement at least 1 sec).
> Hardware-level detection is possible only when the physical layer has
> an end-to-end signaling mechanism for failure. Where no such mechanism
> is available, the PLR will still see the link as up and keep
> forwarding traffic to the downstream LSR even though the link has
> actually failed. I am basically looking for a mechanism that can
> detect "Node Failure" as early as possible, so that the PLR can
> initiate node protection accordingly. I think some kind of OAM
> mechanism might help.
> 
> Thanks,
> Pranjal
> 
> 
> 
> Date: Wed, 10 Aug 2005 10:02:20 -0400
> From: David Charlap <David.Charlap@marconi.com>
> Subject: Re: [mpls] Reg: Fast Reroute Timings in RFC 4090
> To: mpls@ietf.org
> Message-ID: <42FA08EC.8050502@marconi.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Dutta, Pranjal wrote:
> > 
> > As per RFC 4090 - Fast Reroute Extensions to RSVP-TE for LSP
> > Tunnels, it takes 10 ms to reroute the LSP at PLR either by facility
> > back up or one-to-one back up method. The re-route timing is
> > definitely lesser when compared to time taken for global rerouting
> > (at Ingress LSR). My concern is PLR will reroute the LSP locally only
> > when it has the fault notification (after missing hellos from the
> > downstream neighbor). Assuming that min 1 second of RSVP hello
> > interval is being used, it will take at least 1 second in total to
> > re-route the traffic locally which will lead to loss of voice/video
> > packets and affects traffic performance. I would like to know if
> > there is some effort going on in this group to address this issue.
> > Any pointers will be helpful.
> 
> First of all, the Hello interval may be faster than 1s, depending on
> the implementation.  The RFC actually recommends 5ms intervals, but I
> don't know if anyone actually implements that rate.
>
> Second, a switch may detect a link failure by means other than loss of
> RSVP-Hello.  For instance, the hardware may detect a loss of carrier on
> the interface.  (As a matter of fact, I would expect hardware detection
> of link failure to be the only way that can reroute traffic fast enough
> for real-time applications.)
>
> If a switch loses RSVP Hellos, but does not lose carrier, then the link
> is still physically active.  It is probably still forwarding data-plane
> traffic as well (even though the control plane has probably failed.)  So
> actual losses may not be nearly as severe as you are assuming.
> 
> -- David
> 
> 
> 
> ------------------------------
> 
> _______________________________________________
> mpls mailing list
> mpls@lists.ietf.org
> https://www1.ietf.org/mailman/listinfo/mpls
> 
> 
> End of mpls Digest, Vol 16, Issue 7
> ***********************************
> 