The MPLS WG Archive


Latest MPLS Ping draft...

  • From: neil.2.harrison@bt.com
  • Date: Fri, 18 Oct 2002 11:49:21 +0100

Some general and specific observations on the latest draft 01.  

General comments:
-----------------

1	There are (at least) 3 phases to consider:
(i)	automatic detection of defects
(ii)	automatic consequent actions on detecting defects (these may differ
between different defect types)
(iii)	ad hoc diagnostic tools to investigate anomalous behaviour (esp
where such tools are either too complex/processing-intensive or are simply
not needed in the general 80/20 case)

The draft does not identify/specify any defects, and therefore by inference
does not give any (automatic) consequent actions.  This is not a criticism
per se, just an observation of its limitations.  It therefore falls more
naturally in the ad hoc diagnostics category.

2	IP/cnls networks cannot suffer from the same defects as an MPLS
network.  This is because an IP pkt contains full source/destination
addresses, ie each/every pkt contains its own 'connectivity verification'
handle that is networkwide unique.  LDP LSPs however use only link-local
unique identifiers to effect hop-by-hop routing.  It is therefore possible
for MPLS networks to misdirect traffic in a way that will not
'self-correct' (as misrouting would in the IP case).  When using
link-local identifiers one should consider all these defects:
-	simple breaks
-	swapped connections
-	traffic from LSP A leaking into LSP B say (in simplest case
involving only 2 LSPs), which has 2 modes:
	* LSP A traffic is otherwise unaffected, ie despite misbranching
into LSP B, LSP A's traffic also carries on to its proper destination....so
there is no indication of problems on LSP A and only LSP B is affected
	* LSP A suffers a break, ie as a consequence of misbranching into
LSP B, LSP A's traffic does not carry on towards its proper destination, so
here both LSP A and LSP B are affected.
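
To make the defect cases above concrete, here is a rough sketch in Python
of label-only forwarding between two LSPs.  Everything in it is an
assumption for illustration: the LSR names (R2, R3, R5), the label values,
the per-pkt lsp_id handle and the sink-side classification names are
hypothetical and are not taken from the draft or from Y.1710/Y.1711.  The
only point is that a sink seeing nothing but link-local labels cannot tell
these cases apart, whereas a sink that also sees a networkwide unique
source handle can.

```python
# Illustrative model only: LSR names, labels and defect names are hypothetical.
# Forwarding table per LSR: incoming label -> list of (next_hop, outgoing_label).
# A next_hop of None means the packet is delivered (the LSP sinks) at that LSR.

def deliver(tables, node, label, lsp_id, sinks, ttl=16):
    """Forward by link-local label only; record lsp_id at every sink reached."""
    if ttl == 0:
        return  # guard against forwarding loops
    for next_hop, out_label in tables.get(node, {}).get(label, []):
        if next_hop is None:
            sinks.setdefault(node, set()).add(lsp_id)
        else:
            deliver(tables, next_hop, out_label, lsp_id, sinks, ttl - 1)


def classify_at_sink(expected_id, received_ids):
    """Classify what a sink sees, using the source handle carried in each packet."""
    received = set(received_ids)
    if not received:
        return "break"      # nothing arrives at all
    if expected_id not in received:
        return "swapped"    # only another LSP's traffic arrives
    if received - {expected_id}:
        return "leak_in"    # own traffic plus another LSP's traffic
    return "ok"


def run(r2_table):
    """LSP A: R2 (label 10) -> R3 (label 20). LSP B: R2 (label 11) -> R5 (label 21)."""
    tables = {
        "R2": r2_table,
        "R3": {20: [(None, None)]},
        "R5": {21: [(None, None)]},
    }
    sinks = {}
    deliver(tables, "R2", 10, "A", sinks)
    deliver(tables, "R2", 11, "B", sinks)
    return (classify_at_sink("A", sinks.get("R3", set())),
            classify_at_sink("B", sinks.get("R5", set())))


# Each scenario differs only in how the transit LSR R2 is programmed.
SCENARIOS = {
    "healthy":        {10: [("R3", 20)], 11: [("R5", 21)]},
    "break_on_A":     {11: [("R5", 21)]},
    "swapped":        {10: [("R5", 21)], 11: [("R3", 20)]},
    "leak_A_intact":  {10: [("R3", 20), ("R5", 21)], 11: [("R5", 21)]},
    "leak_A_broken":  {10: [("R5", 21)], 11: [("R5", 21)]},
}

for name, r2_table in SCENARIOS.items():
    print(name, run(r2_table))  # (state seen at LSP A sink, state seen at LSP B sink)
```

In the two leak scenarios the sink of LSP B reports the same thing, and
only the state seen at the sink of LSP A distinguishes the two modes.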

Y.1710 requires that these defects be specified (in terms of entry/exit
criteria and consequent actions), and Y.1711 gives very simple techniques
as solutions for the p2p case.  These defects should ideally also be
specified for the mp2p case of LDP.  Besides the obvious need for
operators to be able to detect/handle these defects automatically, there are
other reasons why this is required, such as:
-	the need to suppress traffic where necessary to protect the
confidentiality of customer traffic, eg with swapped LSPs;
-	in cases such as ATMoverMPLS the need to provide indications (from
the LSP sink point where the defect is detected/handled) to the client (ATM
here) network in order to suppress alarm storms at the sink points of the
client network trails (these may be in a different operator network to the
MPLS network and potentially terminate at many geographically diverse
locations);
-	to get some common ground for interworking between vendors and/or
between operators it's clearly important to have common specifications for
defects and their handling.

If this is not possible (for whatever reasons, eg mechanisms defined or
network technology considered) then these limitations should be indicated so
that operators can understand the scope of application.

3	If we can't define standardised defects for whatever reasons, then
there are further limitations that should also be noted:
-	it will not be possible to specify standardised availability
entry/exit criteria
-	it will not be possible to specify standardised QoS objectives, and
in particular their apportionments across operator domains.....this arises
since one needs the availability 'up-time' datum against which QoS metric
collection is valid, ie collection of such metrics against long-term QoS
objectives should be suppressed during unavailability.
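
The dependency in the second point above can be sketched as follows.  This
is a minimal Python sketch under stated assumptions: the function name, the
per-second input format and the threshold values are hypothetical and are
not taken from Y.1710/Y.1711 or from the draft, and for simplicity the
state change takes effect at the threshold second with no back-dating.  It
only shows that metric collection needs well-defined entry/exit criteria
for unavailability before it can know which seconds to count.

```python
def collect(seconds, enter_after=10, exit_after=10):
    """Accumulate a loss metric only while the LSP is in the available state.

    seconds: iterable of (defect, lost_packets) tuples, one per one-second
             interval, where defect is True if a defect was seen that second.
    enter_after / exit_after: hypothetical entry/exit criteria, expressed as
             consecutive seconds with / without a defect.
    Returns (counted_seconds, counted_loss).
    """
    available = True
    run = 0                # consecutive seconds contradicting the current state
    counted_seconds = 0
    counted_loss = 0

    for defect, lost in seconds:
        contradicts = defect if available else not defect
        run = run + 1 if contradicts else 0

        threshold = enter_after if available else exit_after
        if run >= threshold:
            available = not available   # entry or exit criterion met
            run = 0

        if available:
            # Only seconds of available time count towards the QoS objective.
            counted_seconds += 1
            counted_loss += lost

    return counted_seconds, counted_loss


if __name__ == "__main__":
    trace = [(False, 1), (True, 5), (True, 5), (True, 5),
             (True, 5), (False, 0), (False, 2), (False, 1)]
    print(collect(trace, enter_after=3, exit_after=2))
```

Without an agreed definition of what counts as a defect, the defect flag in
each second has no standard meaning, and so neither does the result.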

Note - the above are not meant as criticisms of the LSP-Ping mechanism per
se (indeed I think it's a useful diagnostic tool), they are merely
observations of the limitations of its scope.....esp in the context of mp2p
LSP constructs, which is probably the more limiting factor here.

Specific comments on the draft:
-------------------------------

4	Towards the end of section 2 the draft talks about '....periodically
ping a FEC...' and 'If a ping fails...'.  I think it's fine to leave these
undefined like this so long as we agree that LSP-Ping is really a
diagnostics tool, and will have limitations wrt its scope as a defect
detection/handling tool or for any consequential use in measuring
availability/QoS.  However, if this is not the case then I think
well-defined defect specifications (for the reasons given previously) will
be required .....if this is indeed possible.

5	I was not clear why the opening sentence of section 3 says '...a
(possibly) labelled UDP packet.' (a similar sentence appears at the start of
section 4.1).  I thought the intention was to exercise the data-plane of the
outgoing LSP and so the pkts MUST be labelled.  Can you please clarify what
is meant here?

6	In section 3 the 'Reply Mode' field is only specified for IPv4
packets.  Is there any reason the possible use of IPv6 packets has been
omitted here, ie when using LSP-Ping in an IPv6 network? 

7	Section 3.3 discusses the 'Pad TLV'......what is the intended use of
this?

8	In section 5 reliable return paths are discussed.  Agreed that
defect detection/diagnosis/handling is a problem when attempted as a
bidirectional function.  When used with RSVP signalled p2p LSPs, I note
there is an option to use the RSVP return LSP RESV message to carry back
information about Pings sent in the other direction.  Two observations:
-    forced coupling with a specific signalling protocol.....this is not a
good idea IMO, the data-plane defect handling ought to look after itself
independently of the control-plane routing/signalling protocols being used;
-    will only detect simple breaks, eg if pkts are leaking out of the
forward direction of the LSP there will not be a return RSVP instantiated
LSP (to wherever they end up) to indicate that this defect
exists..........this will require the LSR where the pkts end up to use an
IP return mode to detect it.  Given this, why bother with having an RSVP
return mode in the 1st place....just use the IP-only return mode in all
cases?

BTW - when we are dealing with p2p entities, like an ER LSP, consideration
should be given to the use of the less complex and more powerful techniques
given in Y.1711....esp so in case of non-IP client layers.

That's all the comments I have for now.

thanks and regards, Neil