During today's MPLS session it was clear that more
discussion/clarification was required on the problem space addressed by the
I-D.
There are two aspects to the problem discussed in the
draft.
1) Mixing e2e path testing with adjacency fault detection has
coordination problems when failures occur. If I have an LSP hierarchy that I
am pinging at some low level, and a failure occurs, some number of the pings
will be affected, and response/alarm management will be hard to synchronize
as ping sources are not local to the fault (and the fault may also be detected
locally, e.g. link failure). IMHO this is an artifact of the delta between
using ping reactively to check routing policy vs. using ping proactively to
detect data plane failures.
NH=> The simple short-term fix to the problems mp2p creates, I believe, is
as follows:
- mp2p merging in any co pkt-sw forwarding technology means we lose sight of
the source, and architecturally this should tell us this is a deprecated
topology in such a mode......BTW, IP *never* uses merging; it is always
muxing, because it belongs to the cnls mode where each/every pkt has a SA/DA
and so is fully resolvable. So unless the *immediate* client is IP (but then
there is little value-add from MPLS of the LDP variety here IMO) there are
always p2p LSPs (in one form or another) above the mp2p LSPs......this is a
forced consequence of the mp2p merging, to effect demerging.
- so run some decent fault management continuously on the p2p LSPs (like
Y.1711) and if this shows a problem then:
* check the L1 trails.....if these show a correlated fault then done....but
if these show 'clear', then
* wheel out more complex ad hoc tools like LSP-Ping to go hunting through
the mp2p LSPs.
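The escalation order in the list above can be sketched as a small decision
function. This is only an illustration of the ordering: the inputs
(`p2p_defect`, `l1_trail_faulty`) and the `Verdict` names are hypothetical and
do not correspond to any real Y.1711 or LSP-Ping API.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    NO_FAULT = "no problem seen on the p2p LSP"
    L1_FAULT = "fault correlated with an L1 trail: done"
    NEEDS_LSP_PING = "L1 clear: go hunting through the mp2p LSPs"


def triage(p2p_defect: bool, l1_trail_faulty: Callable[[], bool]) -> Verdict:
    """Apply the escalation order from the list above.

    p2p_defect      -- hypothetical defect flag from continuous monitoring
                       of the p2p LSP (e.g. a Y.1711-style check)
    l1_trail_faulty -- hypothetical callback that inspects the L1 trails;
                       it is only called when the p2p LSP shows a problem
    """
    if not p2p_defect:
        return Verdict.NO_FAULT
    if l1_trail_faulty():
        return Verdict.L1_FAULT
    # Only now fall back to the more complex ad hoc tooling.
    return Verdict.NEEDS_LSP_PING
```

The point of the ordering is that the expensive step (searching the mp2p
LSPs) is reached only when both cheaper checks have failed to explain the
defect.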
Note - I am actually happy to live with LSP-Ping, and
the fact it's virtually impossible to define any meaningful defect states
(entry/exit criteria and consequent actions) or QoS behaviour for a mp2p
entity, in the *short-term*....we have it and have to live with it. But
that does not mean I have to keep building on it and pretending it's all
fine for the future too.....long-term this (and other architectural errors
like PHP) simply has to go. So I won't accept kludge on kludge from
here on in, like the latest PID fix and the further PW OAM proposals.....this
last one is an elastoplast too far. MPLS could/should have an
important future which adds value to service providers......if all one does
is try and replicate the IP layer (which you can't anyway as MPLS uses co
pkt-sw forwarding mode) we are never going to realise the potential of
MPLS. Maybe that is what some people want?
2) Reserved labels define functions instead of
forwarding.
NH=> Agreed. This is not a good solution
IMO......however, what do you do when the header is functionally
deficient?....also see later remark.
Fate sharing
between functions and LSPs is useful. Currently ECMP breaks fate sharing
between LSPs and LSP functions defined by reserved labels.
NH=> Agreed. And when one considers the
Ethernet VPLS stuff over this there will be a request to *turn off* the ECMP
since one cannot tolerate uni-directional failings here....and ECMP creates a
richer environment for this. Bottom line is load-sharing is OKish within
a *single* layer network (like IP) but it will lead to problems if done over
>1 layer network (i.e. MPLS). And of course it has, ergo the PID
kludge. Right answer of course is to introduce proper p2p TE
capabilities to bypass the SPF/IGP routes in a known/consistent
manner.
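A toy sketch of why ECMP can break fate sharing when a reserved label is
involved. It assumes, purely for illustration, an LSR that picks among
equal-cost paths by hashing label values; real hash inputs vary by
implementation, and the function names and the CRC-based hash here are
hypothetical. The only facts relied on are that label values 0-15 are
reserved and that Router Alert is label 1.

```python
import zlib
from typing import Sequence, Tuple

ROUTER_ALERT = 1          # reserved label value for Router Alert
MAX_RESERVED_LABEL = 15   # label values 0-15 are reserved


def ecmp_path(key: Sequence[int], n_paths: int) -> int:
    """Pick one of n_paths equal-cost paths from a hash of the key (toy model)."""
    data = b"".join(label.to_bytes(4, "big") for label in key)
    return zlib.crc32(data) % n_paths


def whole_stack_key(stack: Sequence[int]) -> Tuple[int, ...]:
    """Hash input that includes every label, reserved or not."""
    return tuple(stack)


def lsp_label_key(stack: Sequence[int]) -> Tuple[int, ...]:
    """Hash input that ignores reserved labels, so only LSP labels count."""
    return tuple(label for label in stack if label > MAX_RESERVED_LABEL)


data_pkt = [1001]                 # ordinary packet on LSP 1001 (example value)
oam_pkt = [ROUTER_ALERT, 1001]    # same LSP, router alert label prepended

# Different hash inputs: nothing guarantees the same path is chosen.
print(whole_stack_key(data_pkt), whole_stack_key(oam_pkt))

# Identical hash inputs: the OAM packet is certain to follow the data.
print(ecmp_path(lsp_label_key(data_pkt), 4) == ecmp_path(lsp_label_key(oam_pkt), 4))
```

With the whole-stack key the two packets present different inputs to the
hash, so whether they share a path is a matter of luck; with the LSP-only key
they cannot diverge at this hop.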
The answer to #1 is to try to localize detection of data plane
problems. The ability to do the equivalent of the router alert label in a way
that fate shares with the LSP is one potential mechanism that could be used to
increase locality in detecting data plane problems. Specific data plane flows
could be inspected for consistency by intermediate LSRs as they are forwarded
down the path.
The answer to #2 is to provide an alternative to reserved
labels for LSP functions. The MPLS/PW PID is a candidate for doing this. The
'A' bit was one example of providing such functionality by defining a
replacement for router alert. A side note is that there is a much higher
likelihood of commonality of forwarding for a solution that has the label of
interest as the top label vs. one that prepends the router alert
label.
NH=> Dave, I agree.....however, if we can one day
recognise mp2p merging as 'bad' then maybe we could properly set the OAM
indicator where it really should functionally reside....as part of the
header.....say using the TTL field which would now be effectively redundant (8
bits is way overkill anyway). In the meantime, I could live with your
proposal if (i) it was recognised as a fix to a problem lying elsewhere, with
the ultimate goal of removing the real problem at some stage and (ii) we
simply use the PID without the other CW stuff (which I regard as largely
irrelevant for a properly engineered/architecturally-valid MPLS
network.....but if the Seq. No. stuff stays then it *must* have entry/exit
defect states and consequent actions associated with it fully specified, it is
unacceptable to leave this open-ended IMO speaking as an
operator).
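For reference, a minimal sketch of the 32-bit MPLS label stack entry (20-bit
label, 3-bit Exp, 1-bit bottom-of-stack, 8-bit TTL), to show where the TTL
octet sits. The `HYPOTHETICAL_OAM_BIT` below is not part of any specification;
it only illustrates the kind of in-header OAM indicator suggested above, and
the choice of the top TTL bit is an arbitrary assumption.

```python
def pack_lse(label: int, exp: int, s: int, ttl: int) -> int:
    """Pack one label stack entry: label(20) | exp(3) | s(1) | ttl(8)."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    return (label << 12) | (exp << 9) | (s << 8) | ttl


def unpack_lse(word: int) -> dict:
    """Split a 32-bit label stack entry back into its four fields."""
    return {
        "label": (word >> 12) & 0xFFFFF,
        "exp": (word >> 9) & 0x7,
        "s": (word >> 8) & 0x1,
        "ttl": word & 0xFF,
    }


# Assumption for illustration only: steal the top bit of the TTL octet as an
# OAM indicator. No standard defines this.
HYPOTHETICAL_OAM_BIT = 0x80


def is_oam(word: int) -> bool:
    """True if the hypothetical OAM bit is set in the TTL octet."""
    return bool(word & HYPOTHETICAL_OAM_BIT)


entry = pack_lse(label=1001, exp=0, s=1, ttl=HYPOTHETICAL_OAM_BIT | 0x3F)
print(unpack_lse(entry), is_oam(entry))
```

Because the indicator would travel inside the label stack entry of the LSP
itself, no extra label is needed to mark the packet.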
Comments?
cheers
Dave