The MPLS WG Archive


MPLSOAM BOF meeting draft minutes

  • From: Curtis Villamizar <curtis@workhorse.fictitious.org>
  • Date: Tue, 18 Dec 2001 10:02:01 -0500
  • cc: curtis@fictitious.org, Shahram_Davari@pmc-sierra.com, rbonica@mci.net, pingpan@juniper.net, gash@att.com, mpls@UU.NET


[Note: Ron see request below.]

In message <B9571FDEBD3DD21181E500606DD5EE050E891C65@mbddmknt01.hc.bt.com>,
neil.2.harrison@bt.com writes:
> Curtis Villamizar  Sent: 18 December 2001 01:17
> <some items snipped>
> > > > The sender must have feedback to avoid an unintended DoS 
> > attack on the
> > > > egress, so LSP Ping was a distinct advantage.
> > > 
> > > Since Harrison's proposal requires a fixed rate of 1 CV 
> > packet per second,
> > > it is in fact very easy to detect DOS attacks.
> > 
> > Fixed rate is not adequate.  Some will find this too frequent.  Others
> > will need finer granularity.
> NH=> See the excellent points made by John Rutemiller last night also on
> this.  Curtis you seem to be continually missing the point.  Let me try and
> put a few things in context here and see if this helps you:
> -	you *don't* have to use CV on all LSPs OK

It does not make sense to deploy an open-ended protocol with the
potential for one router to overload another.  I don't know if it has
ever been explicitly stated, but avoiding that type of situation by
providing feedback to the sender has always been a goal.

The rate should be configurable.  If you want to fix it at one second
in your network you can still do so.  We will never get useful
performance measurement through sending probe packets unless our
performance criteria are abysmally poor.  The T3-NSFNET circa 1994 used
10^-4 packet loss (0.01% loss) as the target.  At one probe per second
that would require at least a multiple of 10,000 seconds, or almost
three hours at a minimum, to have any confidence at all of catching
loss at that rate.  Even 0.1%
loss (fair IP service, not good, not terrible either) would require
measurement over a multiple of 1,000 seconds (which is actually done
for some SLA checks).  Low rate ping or OAM CV is just a connectivity
check.  Probe counting is a poor measurement of performance.
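
As a rough illustration of the arithmetic above, the following Python
sketch estimates how long a fixed-rate probe stream must run before it
is likely to see even one lost probe.  It assumes independent losses
at a constant rate, which real networks do not guarantee; the function
names are hypothetical and are not part of any CV or LSP Ping
specification.

```python
import math

def expected_seconds_to_first_loss(loss_rate, probes_per_second=1.0):
    """Mean time until one probe is lost, assuming independent losses."""
    return 1.0 / (loss_rate * probes_per_second)

def prob_at_least_one_loss(loss_rate, n_probes):
    """Chance that n_probes probes include at least one loss."""
    return 1.0 - (1.0 - loss_rate) ** n_probes

def probes_for_confidence(loss_rate, confidence):
    """Smallest probe count that sees a loss with the given probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - loss_rate))

if __name__ == "__main__":
    for rate in (1e-4, 1e-3):
        secs = expected_seconds_to_first_loss(rate)
        n95 = probes_for_confidence(rate, 0.95)
        print(f"loss {rate:g}: mean {secs:.0f} s to first loss, "
              f"{n95} probes at 1/s for 95% chance of seeing one")
```

Under these assumptions, 10,000 one-per-second probes at 10^-4 loss
still leave roughly a one in three chance of seeing no loss at all,
which is why a multiple of that interval is needed before the count
says anything about the loss rate.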

> -	you can decouple source/sink processing (might not be obvious
> why.....but here are some pointers:  CV is unidirectional (for good reasons
> as explained why by several people already).  CV generation is trivially
> easy.  Some LSPs don't need CV *monitoring*, ie unimportant ones (possibly
> potential candidates for Ping).  CV can detect more than simple breaks, like
> various misbranching/merging/config cases and immediately identify
> offender.....and if you have some important customers whose traffic
> integrity needs protection this might be considered quite important)

Your first point is regarding efficiency.  There is no strong argument
for putting the load on the egress.  Not having any feedback to sender
is a poor design.  We can agree to disagree on this.

Your second point is regarding another type of failure.  One traffic
flow may show up for no good reason at the egress of the wrong LSP
carrying very important traffic.  If the former LSP is not monitored,
no MPLS-OAM traffic is sent and no LSP Ping is done, so neither
mechanism detects it.  If that LSP is monitored, MPLS-OAM
detects the error as traffic showing up at an egress where it doesn't
belong and LSP Ping detects it as the wrong egress in the return
address on the ICMP reply.  LSP Ping may need a less frequent UDP
probe that identifies the LSP and sends that back to the ingress to
ensure that the packet not only reaches the correct egress LSR but
also emerges on the correct LSP.

Current hardware does not remember the prior label after it is popped,
so there may be a problem identifying the LSP after inspecting what is
underneath it.  For that we may need to limit TTL, recognize that the
bottom label is special (whether MPLS-OAM or IPv4 Explicit Null) and
process according to the banished TTL expired rules.

Maybe what we really need is to resurrect Ron Bonica's MPLS ICMP TTL
exceeded draft and specify that for non-IP LSPs, an IPv4 Explicit Null
is used at the bottom (with an ICMP echo reply).  You keep sending
larger TTLs, getting ICMP TTL exceeded, until you get an ICMP echo
reply.  This would trace the data path including labels for non-IP LSPs
but may raise objections on religious grounds.  Then again, we could just
do it.  [Ron - would you mind adding this usage and respinning as
draft-ietf-mpls-icmp-03.txt?  We can try for updated radioactive
historic banished RFC status.]
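
The probing loop described above can be sketched as a small Python
simulation.  This is only a model of the control flow, not an
implementation of the draft: the path is a plain list of hypothetical
LSR names with the egress last, probe() stands in for sending a real
packet, and label reporting in the TTL exceeded replies is left out.

```python
def probe(path, ttl):
    """Simulate one probe sent with the given TTL along the LSP.

    path lists the LSRs after the ingress, egress last (an assumption
    made for this sketch).  Returns (reply_type, responder).
    """
    if not path:
        raise ValueError("path must contain at least the egress LSR")
    if ttl < len(path):
        # TTL runs out at a transit LSR, which answers TTL exceeded.
        return ("ttl-exceeded", path[ttl - 1])
    # The probe reaches the egress, which answers with an echo reply.
    return ("echo-reply", path[-1])

def trace_lsp(path, max_ttl=255):
    """Raise the TTL one hop at a time until the egress replies."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        kind, node = probe(path, ttl)
        hops.append((ttl, kind, node))
        if kind == "echo-reply":
            break
    return hops

if __name__ == "__main__":
    for hop in trace_lsp(["P1", "P2", "PE2"]):
        print(hop)
```

Each TTL exceeded reply names one transit hop, and the first echo
reply ends the trace at the egress.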

> -	CV rate needs to be fixed *if* you want standardised defects
> (entry/exit and their consequent action handling, like sending FDI to
> supress client layer alarms....did you see my analysis of ATMoverMPLS last
> night?) and, based on this, standardised availability criteria....if you
> vary the rate these metrics vary too....ergo no consistent behaviour from
> LSP to LSP and no chance to create consistent SLAs.  Some operators might
> find some of these reasons important for certain customers and none of this
> has been considered so far in Ping.

I'd hate to be taking corrective action based on a one packet per
second probe.  If the send rate is configurable there is nothing to
stop you from setting it to one second everywhere in **your** network.

> -	CV needs to be unidirectional (many reasons) but in context of
> previous point availability/QoS have to monitored in a unidirectional sense.

It doesn't **need** to be unidirectional.  We can agree to disagree as
to whether the advantages of feedback to the sender are more important
than probe traffic being unidirectional.

> -	etc
> So it depends on application.  I would never consider monitoring all the
> PSTN connections in our telephony network....I'd use a reactive tool like
> Ping (we actually use a form of sampling in practice to spot latent faults).
> This is good enough, and appropriate, for this network.  On more important
> connections (like current leased lines/VPNs) we want the ability to monitor
> the trails since they have stringent SLAs associated with them.  I would
> never consider *sink* processing of CV on all the LSPs that carry
> BE/Internet traffic.....but I might want it on important LSPs, like VPNs,
> LSP transit services, XoverMPLS emulation services, etc.  CV has been
> designed very much with how operators have needed to provide services to
> important customers/applications.  Its horses for courses....does that help
> or not?  Maybe we should have both mechanisms and let customers (ie
> operators) choose which one they want?

Monitoring every active PSTN pair passing through a major junction or
every pair of IP addresses on the planet is infeasible so you don't need
to mention that you don't plan to do it.

Most IP providers today that use MPLS for BE Internet traffic **do**
currently send pings through all of those LSPs to make sure they stay
up.  There are only a small number of those.  It is VPN and XoverMPLS
that generate a large number of labels.  It may be those VPN and
XoverMPLS LSPs that need to be monitored end to end.  If you plan to
succeed with the service you need to expect a large number of them.
Specifying protocols with the ability to scale is an explicit goal
stated by the IAB.

Curtis