comments on draft-ietf-mpls-oam-requirements-02
Hi Yetik:
Thanks for your comments, some replies below...
>I have some comments on this draft. I categorized them below (requirements,
>editorials, general).
>
>
>Thanks,
>yetik
>
>Requirements:
>------------------
>
>
> 3.1 Detection of Broken Label Switch Paths
> [snip]
>
> If the time to detect defects is specified and tools designed
> accordingly then a harmonized operational framework can be
> established both within MPLS levels, and with MPLS applications.
> If the time to detect is known, then automated responses can be
> specified both with regard to resiliency and SLA
> reporting. One consequence is that ambiguity in maintenance
> procedures MUST be minimized as ambiguity in test results impacts
> detection time.
>
>
>This requirement is ambiguous to me. We need the OAM tools, but we have to
>utilize the tools in a correct way to make the best out of them. Does this
>requirement say that every OAM tool MUST do what it is supposed to do?
I don't quite get the disconnect other than that the requirement probably
should be two requirements. If you are going to specify the follow on
procedures for detecting a problem both with respect to recovery, hold off
actions, and SLA reporting, then in order to be useful you need some idea of
how long it takes to detect the problem.
The second order effect is that if some OAM procedure provides an ambiguous
result, then the overall response time gets pushed out and complicates SLA
reporting. Not the end of the world, but something to be minimized as much as
possible (nice to have black and white instead of
working-suspect-not-working as the set of states).
> [snip]
> Detection tools should have minimal
> dependencies on network components that do not implement the LSP.
>
>Do we mean P routers here or non-MPLS routers by network components?
This is one way to reduce ambiguity in results. An LSP is a uni-directional
thing, so a test that depends on bi-directional connectivity depends on the
return path as well. Any ping failure therefore provides an ambiguous
result.
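
A minimal sketch of the ambiguity described above, assuming a simple model in
which a ping reply is seen only when both the forward LSP and the return path
are working. The names ping_reply_received and interpret_ping are hypothetical
and exist only for illustration:

```python
from enum import Enum


class Verdict(Enum):
    FORWARD_OK = "forward LSP is working"
    AMBIGUOUS = "forward LSP or return path has failed"


def ping_reply_received(forward_lsp_up: bool, return_path_up: bool) -> bool:
    # Assumed model: the reply only arrives if both directions work.
    return forward_lsp_up and return_path_up


def interpret_ping(reply_received: bool) -> Verdict:
    # The sender only observes whether a reply came back. A missing reply
    # cannot be attributed to the unidirectional LSP under test.
    return Verdict.FORWARD_OK if reply_received else Verdict.AMBIGUOUS


# A broken LSP and a broken return path are indistinguishable to the sender.
broken_lsp = interpret_ping(ping_reply_received(False, True))
broken_return = interpret_ping(ping_reply_received(True, False))
print(broken_lsp == broken_return)
```

A detection tool that does not rely on the return path would not have the
second failure case at all, which is the point of the quoted requirement.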
> 3.2 Diagnosis of a Broken Label Switch Path
>
> The ability to diagnose a broken LSP and to isolate the failed
> resource in the path is required.
>
>What specific resource is intended here? A PE router's CPU can be
>overloaded, and its routes can flap, which may result in LSP flapping. Do
>we want MPLS OAM tools to determine that the CPU of that PE is the
>root-cause?
The reference is more with respect to the ability to traceroute an LSP and
isolate the problem. Perhaps a re-wording?
> 3.3 Path characterization
>
> The ability of a path trace function to reveal details of LSR
> forwarding operations relevant to OAM functionality. This would
> include but not be limited to:
> - use of pipe or uniform TTL models by an LSR
> - externally visible aspects of load spreading (such as
> ECMP), including type of algorithm used
> examples of how algorithm will spread traffic
> - data/control plane OAM capabilities of the LSR
> - stack operations performed by the LSR (pushes and pops)
>
>The first sentence is not complete. Hence I do not understand this section.
>Does the bullet list provide the conditions
>when the path trace gives useful results?
The bullet list refers to things that would be useful to be able to learn
via a path trace function. A lot depends on how TTL is handled (for example),
so variations in TTL copying between layers are useful to know when trying to
interpret the results.
> 3.5 Frequency of OAM Execution
>
> [snip]
> To elaborate, there are defect conditions (specifically
> misbranching or misdirection of traffic) for probe based detection
> mechanisms combined with automated network response requires
> harmonization of probe insertion rates and probe handling across
> the network in order to avoid flapping.
>
>This is not clear. I believe it says that the determination of the
>frequency of OAM Execution is not science, therefore
>flexibility is needed to be able to adjust based on the network
>observations. Is that true?
No, the intent is more to illustrate that there are classes of problems that
have dependencies on more than one LSP (e.g. leakage), and if you are
monitoring for leakage, then you need to move towards a common detection
time, otherwise leakage will cause flapping.
> One observation would be that commoditization of MPLS, common
> optimized implementation of monitoring tools and the need for inter-
> carrier harmonization of defect and SLA handling will drive
> specification of OAM parameters to commonly agreed on values and
> such values will have to be harmonized with the surrounding
> technologies (e.g. SONET/SDH, ATM etc.) in order to be useful.
>
>What does commoditization (I don't know if there is such a word) mean here?
>Does it mean "wide-spread deployment"?
Yes...that may be a better term.
>I find this section not clear, the sentences are too long.
>
> 3.6 Support for OAM Interworking for Fault Notification
>
> An LSR supporting OAM functions for pseudo-wire functions that
> join one or more networking technologies over MPLS must be
> able to translate an MPLS defect into the native technology's
> error condition. For example, errors occurring over the MPLS
> transport LSP that supports an emulated ATM VC must translate
> errors into native ATM OAM AIS cells at the edges of the pseudo-
> wire. The mechanism SHOULD consider possible bounded detection
> time parameters, e.g., a "hold off" function before reacting as
> to harmonize with the client OAM. One goal would be alarm
> suppression in the psuedo-wire's client layer. As observed in
> section 3.5, this requires that the MPLS layer perform detection
> in a bounded timeframe in order to initiate alarm suppression
> prior to the psuedo-wire client layer independently detecting the
> defect.
>
>Does this not belong to PWE3?
Possibly; however, as PWE uses MPLS tools (e.g. LSP-PING mode in VCCV), the
requirement will end up here somehow...
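
The timing constraint in the quoted section (the MPLS layer has to detect the
defect and signal it before the pseudo-wire client detects it independently)
can be sketched roughly as below. This is only an illustration: the function
name, the idea of placing the hold-off in the client, and all millisecond
values are assumptions, not something taken from the draft.

```python
def client_hold_off_ms(mpls_detect_ms: float,
                       notify_ms: float,
                       client_detect_ms: float) -> float:
    """Return how long the client would need to hold off its own alarm.

    mpls_detect_ms:   assumed upper bound on MPLS-layer detection time
    notify_ms:        assumed time to deliver the indication to the client
    client_detect_ms: assumed time for the client to detect the defect itself

    A result of 0 means the MPLS-layer indication already arrives in time,
    so alarm suppression works without any extra hold-off.
    """
    indication_arrives_ms = mpls_detect_ms + notify_ms
    return max(0.0, indication_arrives_ms - client_detect_ms)


# Hypothetical numbers: fast MPLS detection needs no hold-off ...
print(client_hold_off_ms(100, 20, 500))
# ... while slow MPLS detection forces the client to wait.
print(client_hold_off_ms(400, 200, 500))
```

None of this can be computed unless the MPLS detection time is bounded, which
is why the section refers back to 3.5.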
> 3.8 The commoditization of MPLS will require common information
> modeling of management and control of OAM functionality. This
> will be reflected in the the integration of standard MPLS-related
> MIBs (e.g. [LSRMIB][TEMIB][LBMIB][FTNMIB]) for fault, statistics
> and configuration management. These standard interfaces
> provide operators with common programmatic interface access to
> operations and management functions and their status.
>
>No title. What is the requirement?
How about "standard management interfaces"?
> 3.9 Detection of Denial of Service attacks as part of security
> management.
>
>No section body text. Can you elaborate a little bit?
Will have to work that one offline...
> 3.10 Per-LSP Accounting Requirements
>
> [snip]
> (1) Collecting information to design network
>
> For the purpose of optimized network design, SP offers that the
> traffic information regarding among POP and/or router.
> Optimizing network design needs this information.
>
> (2) Providing high-level SLA
>
>What does high-level SLA mean? Does it mean stringent?
>
> 3.10.1 Requirements
>
> Accounting on a per-LSP basis encompasses the following set of
> functions:
>
> (1) At an ingress LSR accounting of traffic through LSPs
> beginning at each egress in question.
>
> (2) At an intermediate LSR, accounting of traffic through
> LSPs for each pair of ingress to egress.
>
>I think by intermediate LSR you mean P routers. How can a P router provide
>accounting for every LSP from ingress to egress (PE-to-PE)?
It can provide counts for any LSP visible to it (e.g. a P will not see PW
traffic but will see PSN traffic, since PW LSPs are not visible to P LSRs).
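
As a rough sketch of what "visible to it" means here, assume a P LSR that
accounts traffic only by the outermost (PSN tunnel) label of each packet. The
label values, byte counts, and the function name account_by_top_label are
made up for illustration:

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple

# A packet is modeled as (label stack, size in bytes); the first label in
# the stack is the outermost one.
Packet = Tuple[Sequence[int], int]


def account_by_top_label(packets: Iterable[Packet]) -> Counter:
    """Sum bytes per outermost label, as a P LSR could do."""
    byte_counts: Counter = Counter()
    for label_stack, size in packets:
        # Inner labels (e.g. PW labels) are ignored: the P LSR forwards on
        # the top label only and does not see the LSPs beneath it.
        byte_counts[label_stack[0]] += size
    return byte_counts


packets = [
    ((100, 2001), 500),  # PSN tunnel 100 carrying PW 2001
    ((100, 2002), 300),  # PSN tunnel 100 carrying PW 2002
    ((101, 2001), 200),  # PSN tunnel 101 carrying PW 2001
]
counts = account_by_top_label(packets)
print(dict(counts))
```

The two pseudo-wires inside tunnel 100 collapse into a single count at the P
LSR; per-PW accounting would have to be done at the PEs.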
>Editorial:
>---------
>
>Incomplete sentences (e.g., missing verbs), too long and hard to follow
>sentences, grammatical errors ("MUST be have", "can may be").
>I think it is better to use Service Level Specifications of SLAs rather
>than SLAs, as an SLA includes financial agreements as well.
>PWE3 Framework is no more, instead you better use PWE3 Architecture as a
>reference.
>
>General:
>--------
>
>I find the draft hard to follow.
We'll see what we can do...
cheers
Dave