The MPLS WG Archive

[mpls] Re: Reg: Fast Reroute Timings in RFC 4090
Richard,

Of course I agree with these observations. One further point: BFD only
works on architecturally valid trail constructs, i.e. p2p and p2mp. If you
violate the co mode by merging (i.e. LDP/PHP) then the
traffic/fault/performance management problems become rather difficult
(understatement). Adding ECMP to the brew just makes it significantly
worse. Check out the complexity of LSP-Ping: N years on, it is still a
draft (rev 9). This fact alone should tell you something is not quite
right here.

I'd also advise that if you really want to see what simple always-on
defect detection/handling OAM SHOULD look like for the co-ps mode, you
should have a look at Y.1711. BFD is a sort-of Y.1711 look-alike (it has
to be: the nature of the functional problem being addressed is the same,
so the solution has to be similar). However, Y.1711 is a complete
specification. I'm not sure how many folks implement it, as nothing works
well with LDP/PHP/ECMP stuff and many folks already have lots of that.
Further, given it's an ITU Rec and is written with a HW implementation in
mind, I suspect many router vendors don't like it. However, the
functionality specified in it is correct, complete and minimalist;
nothing else exists which comes close to this.

If operators want to (i) get opex down, (ii) have more customer-oriented
proactive network/service management, which includes (iii) getting on the
availability/performance SLA page one day, they are going to need
something like this as a base from which to work. Otherwise these
objectives will not be achievable in any simple and consistently
meaningful way.

regards, Neil

> -----Original Message-----
> From: mpls-bounces@lists.ietf.org
> [mailto:mpls-bounces@lists.ietf.org] On Behalf Of
> richard.spencer@bt.com
> Sent: 11 August 2005 08:10
> To: pdutta@riverstonenet.com; David.Charlap@marconi.com
> Cc: mpls@ietf.org
> Subject: RE: [mpls] Re: Reg: Fast Reroute Timings in RFC 4090
>
> Pranjal,
>
> Total node failures (i.e. a node completely dies) will be detected by
> all directly connected peers, as the links will go down at the
> physical layer. If you want to detect IP/MPLS layer failures, the
> IETF's protocol for doing this is BFD
> (see: http://www.ietf.org/html.charters/bfd-charter.html).
> You can run BFD on a per-LSP basis (in the data plane) and also at the
> IP forwarding level (essentially in the control plane), and the
> detection time is user configurable.
>
> However, I think it's unlikely that you will be able to achieve
> reliable failure detection much faster than 500ms-1000ms. If the nodes
> supported it, you could adjust the timers to detect failures quicker
> than this, but the network would be highly susceptible to false
> failures due to things like network congestion and short error bursts.
> Also, running multiple BFD sessions with very fast detection rates
> will consume a significant amount of processing resources.
>
> In general, when considering failure protection there is a tradeoff
> between network reliability and fast switchover. IMO, optimising a
> large SP network to protect against specific failures (i.e. MPLS node
> forwarding plane failures) faster than 500ms-1000ms (at the expense of
> processing resources and network reliability) is unlikely to be
> technically or commercially viable.
>
> You mentioned packet loss for voice/video traffic in your original
> email. What impact do you think a 500ms-1000ms loss will have at the
> application layer? Can you give some examples?
>
> Regards,
> Richard
>
> > -----Original Message-----
> > From: mpls-bounces@lists.ietf.org
> > [mailto:mpls-bounces@lists.ietf.org] On Behalf Of Dutta, Pranjal
> > Sent: 11 August 2005 05:39
> > To: David Charlap
> > Cc: mpls@ietf.org
> > Subject: [mpls] Re: Reg: Fast Reroute Timings in RFC 4090
> >
> > Hi,
> >
> > I think most vendors implement a minimum of 1 second as the hello
> > interval, as less than that might lead to scalability issues (I am
> > not sure if that is the reason why vendors implement at least 1
> > sec). Hardware-level detection will be possible only when the
> > physical layer has an end-to-end signaling mechanism for failure.
> > Where an end-to-end signaling mechanism is not possible, the PLR
> > will still see that the link is up and keep forwarding traffic to
> > the downstream LSR, although the link has actually failed. I am
> > basically looking for some mechanism that can detect "Node Failure"
> > as early as possible, so that the PLR will initiate node protection
> > accordingly. I think some kind of OAM mechanism might help.
> >
> > Thanks,
> > Pranjal
> >
> > Date: Wed, 10 Aug 2005 10:02:20 -0400
> > From: David Charlap <David.Charlap@marconi.com>
> > Subject: Re: [mpls] Reg: Fast Reroute Timings in RFC 4090
> > To: mpls@ietf.org
> > Message-ID: <42FA08EC.8050502@marconi.com>
> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >
> > Dutta, Pranjal wrote:
> > >
> > > As per RFC 4090 - Fast Reroute Extensions to RSVP-TE for LSP
> > > Tunnels, it takes 10 ms to reroute the LSP at the PLR by either
> > > the facility backup or the one-to-one backup method. The reroute
> > > time is definitely less than the time taken for global rerouting
> > > (at the ingress LSR). My concern is that the PLR will reroute the
> > > LSP locally only when it has the fault notification (after
> > > missing hellos from the downstream neighbor). Assuming that a
> > > minimum RSVP hello interval of 1 second is being used, it will
> > > take at least 1 second in total to reroute the traffic locally,
> > > which will lead to loss of voice/video packets and affect traffic
> > > performance. I would like to know if there is some effort going
> > > on in this group to address this issue. Any pointers will be
> > > helpful.
> >
> > First of all, the Hello interval may be faster than 1s, depending
> > on the implementation. The RFC actually recommends 5ms intervals,
> > but I don't know if anyone actually implements that rate.
> >
> > Second, a switch may detect a link failure by means other than loss
> > of RSVP-Hello. For instance, the hardware may detect a loss of
> > carrier on the interface. (As a matter of fact, I would expect
> > hardware detection of link failure to be the only way that can
> > reroute traffic fast enough for real-time applications.)
> >
> > If a switch loses RSVP Hellos, but does not lose carrier, then the
> > link is still physically active. It is probably still forwarding
> > data-plane traffic as well (even though the control plane has
> > probably failed.) So actual losses may not be nearly as severe as
> > you are assuming.
> >
> > -- David
> >
> > ------------------------------
> >
> > _______________________________________________
> > mpls mailing list
> > mpls@lists.ietf.org https://www1.ietf.org/mailman/listinfo/mpls
> >
> > End of mpls Digest, Vol 16, Issue 7
> > ***********************************
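
The timing arithmetic discussed in this thread can be sketched roughly as
follows: total outage is approximately the failure detection time plus the
local switchover time at the PLR. This is only an illustration, not
anything taken from RFC 4090 or from a vendor implementation. The names
Detector, outage_ms and lost_voice_packets are hypothetical. The 10 ms
switchover figure is the one quoted in the thread. The multipliers (1, 3
and 3.5), the 300 ms BFD interval and the 20 ms voice packetization
interval are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Detector:
    """A hello-based failure detector (hypothetical model for illustration)."""
    name: str
    interval_ms: float  # time between hellos / control packets
    multiplier: float   # intervals that must pass unanswered (assumed value)

    def detection_ms(self) -> float:
        # Failure is declared once `multiplier` intervals pass with no hello.
        return self.interval_ms * self.multiplier


def outage_ms(detector: Detector, switchover_ms: float = 10.0) -> float:
    """Approximate traffic outage: detection time plus PLR switchover.

    The 10 ms default is the switchover figure quoted in the thread.
    """
    return detector.detection_ms() + switchover_ms


def lost_voice_packets(outage: float, packet_interval_ms: float = 20.0) -> int:
    """Voice packets lost during the outage, assuming one packet per
    `packet_interval_ms` (20 ms is an assumed packetization interval)."""
    return math.ceil(outage / packet_interval_ms)


if __name__ == "__main__":
    cases = [
        # 1 s hellos, failure declared after one missed interval (best case)
        Detector("RSVP hello, 1 s", 1000.0, 1.0),
        # 5 ms hellos with an assumed 3.5-interval timeout
        Detector("RSVP hello, 5 ms", 5.0, 3.5),
        # BFD tuned to land inside the 500-1000 ms range (assumed values)
        Detector("BFD, 300 ms x 3", 300.0, 3.0),
    ]
    for d in cases:
        total = outage_ms(d)
        print(f"{d.name}: ~{total} ms outage, "
              f"~{lost_voice_packets(total)} voice packets lost")
```

Under these assumptions, the detection interval dominates the outage
whenever failures are found through missed hellos rather than through
loss of carrier, which is why the 10 ms switchover figure alone says
little about the loss actually seen by voice or video traffic.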