The MPLS WG Archive


draft-nadeau-ip-basedtool-requirements-01

  • From: Monique Morrow <mmorrow@cisco.com>
  • Date: Fri, 15 Nov 2002 16:23:01 +0100
  • Cc: "Thomas D. Nadeau" <tnadeau@cisco.com>, swallow@cisco.com, mpls@UU.NET

Dave,

A belated "ack":

"Interesting document.."   I would hope so since the source of these 
requirements are service operators who are in the process of implementing
MPLS-based networks!

Seriously, many thanks for your valuable comments -- we will integrate 
these into a next revision for clarity.

Best regards,
//Monique


>Tom/Monique/George:
>
>Interesting document, a "few" comments though ;-)
>
>Section 2:
>"Any error condition that prevents an LSR"....do you mean LSP here?
>
>Section 3:
>
>'a' This munges together detection and diagnosis. Shouldn't path liveliness
>and path tracing be separated out as separate requirements? (They seem to be
>synonymous in this document.)
>
>'b' Are you inheriting all the requirements of the GTTP requirements
>document by reference? There is a lot of stuff in there that is more generic
>and may not be applicable. Is there a specific difference between tunnel
>trace and path trace in this document?
>
>'c' With the phrase "automation of path tracing...", are you really discussing
>misbranching/mismerging detection? My read of this is that I am required to
>periodically perform traceroute on LSPs. I doubt that is the actual intent.
>
>'d' [LSPPING] for CE to CE verification. Besides violating the
>"non-prescriptive" claim in the introduction, I didn't think MPLS went CE to
>CE ;-) The wording seems to confuse path tracing as a response to failure
>detection, and the mechanism for failure detection as outlined in 'b'. Also,
>some requirements seem to do with the tools, and others with box behaviors
>(e.g. suggesting a nodal response to failure detection); it might be
>cleaner simply to stick to what the tools should do, and let folks roll
>up the system behaviors into applications themselves.
>
>'e' If I understood 'b' correctly, is this not the same thing?
>
>'f' Requirements 'f', 'l' and 'm' seem to be largely overlapping, although I
>cannot actually parse 'l', and I may be interested in latency for reasons
>other than SLA.
>
>'g' What is a "device" in this context? That it only "may" take
>corrective action runs counter to the first sentence, which suggests the
>network MUST self-heal. The general requirement is rather high level. I've preferred
>the wording that "the network detect faults, and may facilitate automated
>response to restore service (e.g. via fault notification or whatever)".
>
>'h' Agree
>
>'i' What OAM functions are common to how both P2P and MP2P are instrumented?
>Presumably one solution that meets the requirement is to overlay P2P on
>MP2P....
>
>'j' In general agree, with the proviso that some synchronization of tool
>usage/frequency is required for availability metrics, e.g. the network needs
>to function as a system and the OAM functions harmonized across the system.
>
>'k' The wording seems to undermine requirements 'h' and 'r'. IMHO this seems
>to highlight one problem with instrumenting load balancing (as per ECMP
>etc.) in general. The approach is that all paths need to be tested; the
>specific implication is that they all need to be testable from any
>point. I do not think that is sustainable. If OAM probes are to follow the
>same path as the user traffic, then ECMP should function independently of
>both the OAM probes and the user traffic. Otherwise we're into the "monkeys
>and typewriters" verification model as the OAM probes try various
>permutations to impersonate user traffic's ECMP characteristics (which by
>definition is proprietary and therefore unknowable to the probe origin).
>
>'l' See 'f'.
>
>'m' This one is not clear to me, and sort of suggests using bursts of OAM
>packets to characterize LSP performance. A clarification, please?
>
>'n' The actual example is rather trivial as all the LSPs discussed terminate
>in a single box. What is required is alarm suppression in all clients
>regardless of where the client LSP, PW, VCC, VPC etc. terminates.
>
>'o' This seems orthogonal to the other requirements, and seems relatively
>hard not to satisfy. In other words, why is it here?
>
>'p' Seems prescriptive. IMHO detection of, and limitation of
>damage/disruption done by DoS attacks is the requirement. "How" is for
>another day.
>
>'q' OAM interworking for fault notification is only a part of the problem.
>The client may be monitored, therefore timeliness of this stuff matters as it
>either needs to harmonize with the client OAM, or the client OAM needs to be
>configured to "hold off" before reacting. Which also implies a requirement
>for bounded detection times, which is one nit I have with current proposals
>for instrumentation of ECMP and other load sharing mechanisms (one offshoot
>being the "guidelines for load balancing" draft to be presented later on the
>WG agenda).
>
>'r' This is a rather long-winded way to say that "liveliness must test the
>actual forwarding path (proxy verification of what systems "think" is
>happening is insufficient)".
>
>And finally, nothing in the document leaps out at me as justifying the title
>(except perhaps the use of SNMP ;-)
>
>cheers
>Dave
>
>
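
The ECMP concern raised under 'k' can be made concrete with a minimal Python sketch. It assumes a hypothetical hash-based next-hop selection (CRC32 over a 5-tuple, with a per-node seed). Real ECMP implementations choose their own hash inputs and functions, so every name, address, port, and path label below is an illustrative assumption, not something taken from the draft or from any vendor.

```python
import zlib

def ecmp_next_hop(flow, next_hops, seed=0):
    """Pick one next hop by hashing the flow's header fields.

    Hypothetical model: `flow` is a 5-tuple
    (src_ip, dst_ip, protocol, src_port, dst_port) and `seed`
    stands in for per-node hash state the probe origin cannot see.
    """
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key, seed) % len(next_hops)]

def paths_covered(probe_flows, next_hops, seed=0):
    """Return the set of next hops that a batch of probes exercises."""
    return {ecmp_next_hop(flow, next_hops, seed) for flow in probe_flows}

# Illustrative values only.
next_hops = ["path-A", "path-B", "path-C", "path-D"]
user_flow = ("10.0.0.1", "10.0.1.1", 6, 40000, 443)

# The probe origin can only vary its own header fields (here, the
# source port) and hope the permutations land on every path.
probes = [("10.0.0.1", "10.0.1.1", 17, sport, 5000)
          for sport in range(49152, 49152 + 64)]

print("user traffic uses:", ecmp_next_hop(user_flow, next_hops))
print("probes exercised: ", sorted(paths_covered(probes, next_hops)))
```

In this model the probe origin knows neither the hash nor the seed, so it cannot tell in advance which probe, if any, shares a path with a given user flow. It can only vary header fields and observe which paths get exercised, which is the permutation-based verification the comment objects to.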