The MPLS WG Archive


comments on nagarajan-ccamp-mpls-packet-protection-00.txt

  • From: Curtis Villamizar <curtis@workhorse.fictitious.org>
  • Date: Wed, 24 Apr 2002 14:23:19 -0400
  • cc: mpls@UU.NET


In message <B9571FDEBD3DD21181E500606DD5EE0514BABC19@mbddmknt01.hc.bt.com>,
neil.2.harrison@bt.com writes:
> 
> 1	I noted a criticism in the mtg notes that it uses 2x the BW of one
> LSP....well yes it is 1+1 so I'd say that's obvious.  However, I think we
> have to place any such criticism of BW efficiency in context.  For example,
> if I compare it with LDP as a server layer to (say) rfc2547 VPNs, then
> because there is no relationship between a pkt's up-state QoS forwarding
> treatment and a pkt's survivability requirements (vis-à-vis same or
> different DS-coded pkts of *any* VPN), operators are forced to over-engineer
> such networks and *hope* (because there is no assurance) that traffic
> survives under failures....a factor of 2x over-engineering on some DS
> classes is not uncommon.
> 
> Hence, the point I want to make here is that using BW wisely to
> reduce/remove a complexity/problem is one thing (which I support), but
> asking an operator to throw BW at a problem that should not really exist
> (because the application/problem in question has only been partially
> defined, eg just the connectivity bit of VPNs) is something else.  So any
> criticism of BW efficiency only makes sense against the context of the
> application/problem it is addressing IMO.



Point of failure protection schemes (local-protect and fast-reroute)
provide 50msec recovery.  I'll just say FRR for short.  This technique
doesn't require sending double the bandwidth.  Some implementations
might not make the 50msec but some do.

Fully disjoint paths from the ingress yield recovery in 1/4 second to
1-2 seconds (depending again on whose implementation) and these
require less signaling than point of failure protection.  In our
implementation these are called "standby" LSPs.

Reservations on the backups (using either FRR or standby) at a
numerically higher setup/hold priority can reduce congestion during
the failure (or avoid it completely) without impacting the traffic
engineering of the primary path (on the numerically lower priority).
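
To make the priority point concrete, here is a minimal sketch of
admission control on a single link, assuming the usual RSVP-TE
convention that priorities run from 0 (best) to 7 (worst) and that a
new LSP may preempt an existing one only if its setup priority is
numerically lower than the victim's hold priority.  The Lsp and Link
classes and the admit() method are hypothetical names for
illustration, not taken from any implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Lsp:
    name: str
    bandwidth: float
    setup_prio: int  # 0 (best) .. 7 (worst), assumed RSVP-TE convention
    hold_prio: int   # 0 (best) .. 7 (worst)


@dataclass
class Link:
    capacity: float
    lsps: list = field(default_factory=list)

    def reserved(self) -> float:
        return sum(l.bandwidth for l in self.lsps)

    def admit(self, new: Lsp):
        """Admit `new` if it fits, preempting weaker LSPs when needed.

        Returns the list of preempted LSPs (possibly empty), or None
        if the reservation is rejected.
        """
        free = self.capacity - self.reserved()
        if new.bandwidth <= free:
            self.lsps.append(new)
            return []
        # Only LSPs whose hold priority is numerically greater than the
        # new LSP's setup priority may be preempted; worst ones go first.
        candidates = sorted(
            (l for l in self.lsps if l.hold_prio > new.setup_prio),
            key=lambda l: l.hold_prio,
            reverse=True,
        )
        preempted = []
        for victim in candidates:
            if new.bandwidth <= free:
                break
            preempted.append(victim)
            free += victim.bandwidth
        if new.bandwidth > free:
            return None  # not enough preemptable bandwidth; nothing changes
        for victim in preempted:
            self.lsps.remove(victim)
        self.lsps.append(new)
        return preempted
```

With primaries at priority 3 and backups at priority 5, a new primary
can displace a backup reservation on a full link, but a backup can
never displace a primary, so the primary traffic engineering is left
undisturbed.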

The ISPs that I've talked to about the topic of restoration indicate
that although fast restoration is highly desirable for some services,
a strong requirement is efficient use of available bandwidth.  The
goal is to allow the most cost effective delivery of service and keep
the financial bottom line in the black.  This may have something to do
with a competitive business environment but whatever the reason, the
"throw more bandwidth at it" approach does not seem to be in favor.

So far signaling extensions to share the backup bandwidth have been
proposed but gone nowhere (perhaps implementation will help).  This
would yield even better use of resources (or better avoidance of
congestion if overbooking of backup bandwidth were eliminated).  Of
course, if the payload is non-IP, overbooking backups may not be an
option, so this would then just yield better use of resources.
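
As a rough illustration of why sharing helps, the sketch below
compares the backup bandwidth a single link would need with dedicated
reservations against a shared reservation.  It assumes a
single-failure model, in which backups protecting different resources
are never active at the same time; that model, the function names and
the (protected resource, bandwidth) tuples are assumptions for
illustration, not the proposed signaling extensions themselves.

```python
from collections import defaultdict


def dedicated_backup_bw(backups):
    """Every backup crossing the link reserves its own bandwidth."""
    return sum(bw for _, bw in backups)


def shared_backup_bw(backups):
    """Backups protecting different resources share one reservation.

    Under the single-failure assumption, the link only needs enough
    backup bandwidth for the worst single failure.
    """
    per_failure = defaultdict(float)
    for protected, bw in backups:
        per_failure[protected] += bw
    return max(per_failure.values(), default=0.0)


# Backups crossing one link: (resource whose failure activates it, bandwidth)
backups = [("link-A", 4.0), ("link-B", 3.0), ("link-A", 2.0)]
print(dedicated_backup_bw(backups), shared_backup_bw(backups))
```

In this made-up example the dedicated reservation is 9.0 units while
the shared one is 6.0, since only the two backups for link-A can be
active together.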

Of course, FRR requires that the point of failure be able to detect
its own failure which may not be the case for optical devices.
Doubling the bandwidth is a steep price to pay to satisfy the
restoration needs.  If the cost of WDM port plus optical switch port
is half that of a router port, that might be fine.  It still might
make a lot more economic sense to use FRR to provide restoration at
the PSC-capable boxes outside the optical domain, which will generally
be plenty fast enough.

IMHO - there is no need for this draft.  In the end the ISPs/carriers
vote with their dollars and I haven't seen any "yes" votes of that
sort so I won't be coding this quite yet.

As far as VPN is concerned, some providers are using (at least in
trial if not production) LDP over MPLS/TE and using the MPLS/TE
traffic engineering and fast recovery capabilities.  This scales well
if the network is divided into a core and regions, keeping the number
of LSPs manageable, and can work well if the nodes (routers) at the
borders of the core and regions are very reliable.

Curtis