The MPLS WG Archive

[mpls] Reg: Fast Reroute Timings in RFC 4090

  • From: neil.2.harrison@bt.com
  • Date: Thu, 11 Aug 2005 00:00:30 +0100

David Charlap wisely wrote 10 August 2005 15:02
> 
> Dutta, Pranjal wrote:
> > 
> > As per RFC 4090- Fast Reroute Extensions to RSVP-TE for LSP Tunnels,
> > it takes 10 ms to reroute the LSP at PLR either by facility back up
> > or one-to-one back up method. The re-route timing is definitely
> > lesser when compared to time taken for global rerouting (at Ingress
> > LSR). My concern is PLR will reroute the LSP locally only when it has
> > the fault notification (after missing hellos from the downstream
> > neighbor). Assuming that min 1 second of RSVP hello interval is being
> > used, it will take at least 1 second in total to re-route the traffic
> > locally which will lead to loss of voice/video packets and affects
> > traffic performance. I would like to know if there is some effort
> > going on in this group to address this issue. Any pointers will be
> > helpful.
> 
> First of all, the Hello interval may be faster than 1s, depending on
> the implementation.  The RFC actually recommends 5ms intervals, but I
> don't know if anyone actually implements that rate.
> 
> Second, a switch may detect a link failure by means other than loss of
> RSVP-Hello.  For instance, the hardware may detect a loss of carrier on
> the interface.  (As a matter of fact, I would expect hardware detection
> of link failure to be the only way that can reroute traffic fast enough
> for real-time applications.)
NH=> Correct.  If folks want fast restoration, then infrastructure
failure protection is the only viable means.  All layer networks need
their own OAM (independent of any other layer network), and the speed
of defect detection/handling (and protection) should follow the general
rule 'the closer to the application the slower, the closer to the duct
the faster'.  Unfortunately, some of the PWE3 work arguably turns
sensible client/server layer relationships on their head, so it's not
always possible to set general/logical rules for a given layer network
when some folks think, for example, that SDHoverIP is a sensible thing
to do.  BTW, if it's not obvious, I don't think this is sensible at all.
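
For a rough sense of the numbers in this exchange, the following is a minimal Python sketch of Hello-based detection time plus local switchover. The function name, the 3.5 missed-Hello multiplier and the 10 ms switchover figure are assumptions for illustration only (the 10 ms figure is simply the one quoted by the original poster); real implementations and hardware-based detection will differ.

```python
def worst_case_outage_ms(hello_interval_ms, missed_multiplier=3.5, switchover_ms=10.0):
    """Estimate traffic outage when failure is detected by missed Hellos.

    hello_interval_ms: configured Hello interval in milliseconds.
    missed_multiplier: ASSUMED number of Hello intervals that must pass
        without a Hello before the neighbour is declared down.
    switchover_ms: ASSUMED time for the PLR to switch to the backup path.
    """
    detection_ms = hello_interval_ms * missed_multiplier
    return detection_ms + switchover_ms


if __name__ == "__main__":
    # Compare a very fast, a moderate and a 1 s Hello interval.
    for interval in (5, 100, 1000):
        print(f"hello={interval} ms -> outage ~{worst_case_outage_ms(interval)} ms")
```

Under these assumptions the detection term dominates as soon as the Hello interval is more than a few milliseconds, which is the point both posters make about relying on hardware (loss of carrier) detection instead.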
> 
> If a switch loses RSVP Hellos, but does not lose carrier, then the link
> is still physically active.  It is probably still forwarding data-plane
> traffic as well (even though the control plane has probably failed.)  So
> actual losses may not be nearly as severe as you are assuming.
NH=> Not only does each data-plane need its own OAM (and for folks who
are familiar with the G.805 functional architecture, this should be
generic and be inserted/extracted at the trail termination point as part
of the characteristic information (CI) of that layer network), but this
is different from (actually lower level than) whatever OAM is required
to maintain the various functional protocols that a layer network is
carrying.  For example, in GMPLS I can have 2 disjoint layer networks in
SDH, one carrying traffic (eg VC4) and one carrying the
control/management plane protocols for that VC4 layer network, BUT these
2 logically different data-plane layer networks usually come together at
an even lower network layer; in the VC4 case this would usually be the
STM-N level below.  So what I am driving at here is that whilst the
*traffic* carrying VC4 layer network
will use its BIP-8 and trail source address (J1 byte) CI carried in the
VC4 overhead to provide the traffic layer network OAM, the
control/management protocols will have their own OAM running in a
separate layer network, and this will be far more sophisticated than
basic data-plane OAM since there are timers/acknowledgements associated
with these (eg routing, signalling) protocols.  And one key message
here is this: don't use control/management-plane protocols to proxy
for *traffic* data-plane OAM in the general case.  These are different
OAM functions and they often reside in different layer networks (even
within the same networking system).

However, having said all that, I am always puzzled why some folks think
really rapid restoration MUST be done for voice.  I can't believe anyone
would clear down a call after tens, hundreds or maybe even a few
thousand ms of loss.  I know there is some macho 'must equal/beat SDH's
50ms' mindset at work here, but this is nuts.  I know the folks who set
the 50ms defect detection threshold in SDH, and this was NOT done to
accommodate voice, or any application for that matter.  It was simply
down to the physics and ring topologies being considered.

If you invoke protection too quickly you can make things far worse than
simply letting (for example) a transient error event clear by itself.
However, this depends on the layer network you are considering.  If you
really are down close to the physical media/section level (and not some
artificial inversion of this as per PWE3), the OAM defect
detection/handling ought to be fast, and so in such a case it is
sensible to invoke protection once it is truly known this is a real
failure case.
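
One way to picture "invoke protection only once it is truly known this is a real failure" is a simple persistence (hold-off) check. The sketch below is purely illustrative: the function name, the sampling model and the timer values are assumptions, not anything specified in RFC 4090 or in SDH.

```python
def protection_trigger_time(defect_samples, sample_period_ms, hold_off_ms):
    """Return the time (ms) at which protection would be invoked, or None.

    defect_samples: iterable of booleans, one per sample period,
        True meaning a defect was observed in that period.
    Protection is invoked only after the defect has persisted
    continuously for at least hold_off_ms (an ASSUMED hold-off timer).
    """
    persisted_ms = 0
    for i, defect in enumerate(defect_samples):
        if defect:
            persisted_ms += sample_period_ms
            if persisted_ms >= hold_off_ms:
                return (i + 1) * sample_period_ms
        else:
            # The defect cleared by itself, so restart the persistence count.
            persisted_ms = 0
    return None


if __name__ == "__main__":
    transient = [False, True, True, False, False]  # short error burst
    hard_fail = [False, True, True, True, True]    # persistent failure
    print(protection_trigger_time(transient, 10, 30))
    print(protection_trigger_time(hard_fail, 10, 30))
```

With a 30 ms hold-off the short burst never triggers a switch, while the persistent failure does; how long the hold-off should be depends on how close to the physical layer the given layer network sits.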

regards, Neil
> 
> -- David
> 
> _______________________________________________
> mpls mailing list
> mpls@lists.ietf.org
> https://www1.ietf.org/mailman/listinfo/mpls
> 
