The MPLS WG Archive


[mpls] I-D ACTION:draft-ietf-mpls-p2mp-lsp-ping-01.txt

  • From: "Adrian Farrel" <adrian@olddog.co.uk>
  • Date: Thu, 13 Apr 2006 15:37:59 +0100

Hi Ben,
 
> 1) Why is it restricted to P2MP TE LSPs?  mLDP signalled
> LSPs will also require OAM and it's not clear to me why
> they have been precluded by this draft.
 
Well, this draft does *not* preclude mLDP.
There are some factors to consider.
a. When this draft was started, mLDP did not exist.
b. When this draft was developed there were two competing
    mLDP solution drafts and no clear definition of mLDP
    FECs.
c. The existing MPLS-TE P2MP techniques are extensible
    (by use of a FEC) but it is not clear that the problem
    space is the same with regard to scalability for mLDP.
    In particular, the ingress of an mLDP tree is (perhaps)
    not aware of the identities of the egresses and so it is
    hard to ping a single egress along the tree.
d. What would be wrong about having two drafts for this?
 
Actually, at the last couple of IETFs (and on the list, if I recall) I have asked whether to roll this all into one I-D or to keep it separate, and I have not had anything in the way of an answer.
 
At the moment, we are inclined to keep the two pieces of work separate for cleanliness. But doing so does not imply that anyone thinks that mLDP should be without OAM.

> 2) Section 4.1 of draft-ietf-mpls-p2mp-oam-reqs-01 states
> "The ability to detect defects in a P2MP LSP SHOULD not
> require manual, hop-by-hop troubleshooting of each LSR
> used to switch traffic for that LSP, and SHOULD rely on
> proactive OAM procedures (such as continuous path
> connectivity and SLA measurement mechanisms)." whereas
> the comments in draft-ietf-mpls-p2mp-lsp-ping-01 imply
> to me that proactive/continuous OAM may not be possible
> with the proposed solution due to scaling concerns
> resulting from the possibility of flooding the ingress
> with responses.
 
Hmmm.
Maybe there are some questions of semantics around "proactive" and "continuous".
I think that these terms could be taken to mean that you should be able to initiate a "probe" at the ingress and rely on the probe mechanism to determine the existence and location of a defect. This should happen without the operator having to initiate a probe hop-by-hop through the network along the path of the LSP.
 
But we should also take into account that the proposed P2MP LSP Ping is not intended to be a solution for all requirements in the P2MP OAM Requirements I-D. Indeed, section 4.1 of the P2MP OAM Requirements I-D is quite clear about the issues of using LSP Ping for defect detection (compared with section 4.2 that describes defect diagnosis).
 
Whether some other mechanism needs to be added to the toolkit (compare with BFD for P2P LSPs) for error detection is up for grabs, and I am sure that the working group would welcome contributions.
 
> As well as the ability to delay/jitter
> the echo responses has consideration been given to other
> potential ways to improve scalability for example using a
> Nack model for responses, somehow aggregating responses
> etc.?
 
We did discuss a Nack model. I am not sure if it is what you are talking about.
In brief we considered...
- OAM is enabled (see later)
- Egress expects to receive "ping" every time unit
- In the absence of a "ping" the egress sends an alert
 
We felt that OAM should not be permanently active on what may be a low bandwidth service, for fear of it becoming the main consumer of bandwidth. Therefore, OAM must be enabled by an explicit act (and not implicitly enabled by the establishment of the LSP).
 
Thus, we can say that OAM is enabled by the first ping. We would, of course, need a positive ack to that event. After that, we can resort to Nack processing. But I think there is some handshaking needed to make sure that the positive ack has been received.
 
Disabling is more of a worry. If the disable instruction does not reach an egress, it will continue to generate Nacks. Thus we would need some form of back-off on Nacks that drops quite quickly to assuming that OAM is disabled.
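 
To make the Nack discussion concrete, here is a minimal sketch of the egress-side behaviour described above: the first ping enables OAM and is positively acked, a missed ping produces a Nack, and Nacks back off until the egress assumes OAM has been disabled. This is only an illustration and not something taken from the draft. The class name, the parameters (interval, max_nacks, backoff) and the "ack"/"nack" strings are all hypothetical, and the handshake that confirms the positive ack was received is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EgressNackMonitor:
    """Hypothetical egress-side state for the Nack model discussed above."""
    interval: float            # expected time between pings (assumed parameter)
    max_nacks: int = 3         # Nacks to send before assuming OAM is disabled
    backoff: float = 2.0       # multiplier applied to the gap between Nacks
    enabled: bool = False
    nacks_sent: int = 0
    next_nack_at: Optional[float] = None

    def on_ping(self, now: float) -> Optional[str]:
        """Record a ping; the first ping enables OAM and earns a positive ack."""
        first = not self.enabled
        self.enabled = True
        self.nacks_sent = 0
        self.next_nack_at = now + self.interval
        return "ack" if first else None

    def on_disable(self) -> None:
        """Explicit disable instruction received (or disable assumed)."""
        self.enabled = False
        self.next_nack_at = None

    def tick(self, now: float) -> Optional[str]:
        """Call periodically; returns "nack" when a ping is overdue."""
        if not self.enabled or self.next_nack_at is None:
            return None
        if now < self.next_nack_at:
            return None
        if self.nacks_sent >= self.max_nacks:
            # The disable instruction was probably lost: stop sending Nacks.
            self.on_disable()
            return None
        self.nacks_sent += 1
        # Each further Nack waits longer than the previous one.
        self.next_nack_at = now + self.interval * self.backoff ** self.nacks_sent
        return "nack"
```

The caller supplies the clock value, so the same logic can be driven by a real timer or by a simulation. A small max_nacks gives the quick drop to "assume disabled" mentioned above, at the price of going quiet sooner during a genuine failure.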
 
All of this is achievable, and could be built on top of LSP ping, but what is the cost/benefit? It is certainly a more complicated process, and I don't hear anyone suggesting that they plan to implement it. But if someone is working on such a solution, I'm sure we'd be happy to fold it into our draft.
 
 
Aggregation of responses was also briefly discussed. We have two problems with this.
 
1. How does the transit node know how long to hold on to a response in the hope that another will come along? In fact, won't any delay for aggregation simply serve to destroy the benefits of jitter?
 
2. What makes us assume that responses will follow paths that make aggregation practical?
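 
One way to picture the first problem is the toy model below. Each egress delays its response by a random amount within a jitter window, and a transit node that aggregates releases whatever it has collected at each multiple of a hold time. The function names, the uniform jitter and the fixed hold timer are assumptions made purely for illustration; none of this comes from the draft.

```python
import math
import random
from collections import Counter
from typing import List


def jittered_delays(n_egresses: int, window: float, seed: int = 0) -> List[float]:
    """Each egress picks a uniform random response delay within the window."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, window) for _ in range(n_egresses)]


def batch_departures(arrivals: List[float], hold: float) -> List[float]:
    """Assumed aggregator: held responses leave at the next multiple of hold."""
    return [math.ceil(t / hold) * hold for t in arrivals]


def peak_burst(times: List[float]) -> int:
    """Largest number of responses released at the same instant."""
    return max(Counter(times).values())


delays = jittered_delays(n_egresses=100, window=10.0)
short_hold = peak_burst(batch_departures(delays, hold=1.0))
long_hold = peak_burst(batch_departures(delays, hold=10.0))
```

In this model a hold time equal to the jitter window releases essentially every response at the same instant, so long_hold is never smaller than short_hold. Whether those responses travel as one aggregated message or many, the spreading that jitter was meant to provide is gone.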
 
 
Cheers,
Adrian