The MPLS WG Archive

draft-nadeau-ip-basedtool-requirements-01
Dave,

A belated "ack":

"Interesting document.."

I would hope so, since the source of these requirements is service operators who are in the process of implementing MPLS-based networks! Seriously, many thanks for your valuable comments -- we will integrate these into a next revision for clarity.

Best regards,
//Monique

>Tom/Monique/George:
>
>Interesting document, a "few" comments though ;-)
>
>Section 2:
>"Any error condition that prevents an LSR"....do you mean LSP here?
>
>Section 3:
>
>'a' this munges together detection and diagnosis. Shouldn't path liveliness
>and path tracing be separated out as separate requirements? (They seem to be
>synonymous in this document).
>
>'b' Are you inheriting all the requirements of the GTTP requirements
>document by reference? There is a lot of stuff in there that is more generic
>and may not be applicable. Is there a specific difference between tunnel
>trace and path trace in this document?
>
>'c' The phrase "automation of path tracing..." are you really discussing
>misbranching/mismerging detection? My read of this is that I am required to
>periodically perform traceroute on LSPs. I doubt that is the actual intent.
>
>'d' [LSPPING] for CE to CE verification. Besides violating the
>"non-prescriptive" claim in the introduction, I didn't think MPLS went CE to
>CE ;-) The wording seems to confuse path tracing as a response to failure
>detection, and the mechanism for failure detection as outlined in 'b'. Also
>some requirements seem to do with the tools, and others with respect to box
>behaviors (e.g. suggesting a nodal response to failure detection), it might
>be cleaner simply to stick to what the tools should do, and let folks roll
>up the system behaviors into applications themselves.
>
>'e' If I understood 'b' correctly, is this not the same thing?
>
>'f' Requirements 'f', 'l' and 'm' seem to be largely overlapping, although I
>cannot actually parse 'l', and may be interested in latency for reasons
>other than SLA.
>
>'g' What is a "device" in this context? That it only "may" take
>corrective action runs counter to the first sentence suggesting the network
>MUST self-heal. The general requirement is rather high level. I've preferred
>the wording that "the network detect faults, and may facilitate automated
>response to restore service (e.g. via fault notification or whatever)".
>
>'h' Agree
>
>'i' What OAM functions are common to how both P2P and MP2P are instrumented?
>Presumably one solution that meets the requirement is to overlay P2P on
>MP2P....
>
>'j' In general agree, with the proviso that some synchronization of tool
>usage/frequency is required for availability metrics, e.g. the network needs
>to function as a system and the OAM functions harmonized across the system.
>
>'k' The wording seems to undermine requirements 'h' and 'r'. IMHO this seems
>to highlight one problem with instrumenting load balancing (as per ECMP
>etc.) in general. The approach is that all paths need to be tested, the
>specific implication is that they all need to be able to be tested from any
>point. That I do not think is sustainable. If OAM probes are to follow the
>same path as the user traffic, then ECMP should function independently of
>both the OAM probes and the user traffic. Otherwise we're into the "monkeys
>and typewriters" verification model as the OAM probes try various
>permutations to impersonate user traffic's ECMP characteristics (which by
>definition is proprietary and therefore unknowable to the probe origin).
>
>'l' See 'f'.
>
>'m' This one is not clear to me, and sort of suggests using bursts of OAM
>packets to characterize LSP performance, a clarification please..?
>
>'n' The actual example is rather trivial as all the LSPs discussed terminate
>in a single box. What is required is alarm suppression in all clients
>regardless of where the client LSP, PW, VCC, VPC etc. terminates.
>
>'o' This seems orthogonal to the other requirements, and seems relatively
>hard not to satisfy. In other words, why is it here?
>
>'p' Seems prescriptive. IMHO detection of, and limitation of
>damage/disruption done by DOS attacks is the requirement. "How" is for
>another day.
>
>'q' OAM interworking for fault notification is only a part of the problem.
>The client may be monitored, therefore timeliness of this stuff matters as it
>either needs to harmonize with the client OAM, or the client OAM needs to be
>configured to "hold off" before reacting. Which also implies a requirement
>for bounded detection times, which is one nit I have with current proposals
>for instrumentation of ECMP and other load sharing mechanisms (one offshoot
>being the "guidelines for load balancing" draft to be presented later on the
>WG agenda).
>
>'r' This is a rather long winded way to say that "liveliness must test the
>actual forwarding path (proxy verification of what systems "think" is
>happening is insufficient)".
>
>And finally, nothing in the document leaps out at me as justifying the title
>(except perhaps the use of SNMP ;-)
>
>cheers
>Dave