The MPLS WG Archive


Last call on LSP Ping - ECMP thread (continued)

  • From: Curtis Villamizar <curtis@fictitious.org>
  • Date: Mon, 09 Dec 2002 18:02:24 -0500
  • cc: "'curtis@fictitious.org'" <curtis@fictitious.org>, "'erosen@cisco.com'" <erosen@cisco.com>, mpls@UU.NET


In message <1117F7D44159934FB116E36F4ABF221B0267ED64@celox-ma1-ems1.celoxnetworks.com>,
"Gray, Eric" writes:
> Curtis,
> 
> > Actually src/dst addresses are used and the hash is over those 64
> > bits.  Some implementations use (or used) the port numbers as well.
> > The idea was to split up the traffic but make sure no IP microflow was
> > reordered.
> 
> 	This is an unjustified assertion about implementations
> in general.  You are not in a position to make assertions about
> how all implementations do this.
> 
> > The normal HASH is over a 64 bit quantity.  ...
> 
> 	Again, this is an assertion though how anyone might 
> determine what the "normal" hash algorithms might include
> is beyond me.

It comes from a bit over a decade working in Internet operations and
in IP protocol development and experience with a large number of
routers.  Plus it is documented in RFC 2992.

  2992 Analysis of an Equal-Cost Multi-Path Algorithm. C. Hopps.
       November 2000. (Format: TXT=17524 bytes) (Status: INFORMATIONAL)

I think that justifies the assertion.
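
For illustration only, here is a minimal sketch of the kind of selection
described above: hash the 64 bits of source and destination address, then
map the result onto equal-sized regions of the hash space, one region per
next hop, in the spirit of the hash-threshold scheme analyzed in RFC 2992.
The function names and the choice of CRC32 are assumptions made for this
sketch, not any vendor's implementation.

```python
import ipaddress
import struct
import zlib


def flow_key(src: str, dst: str) -> int:
    """Hash the 64 bits formed by the IPv4 source and destination addresses."""
    packed = struct.pack(
        "!II",
        int(ipaddress.IPv4Address(src)),
        int(ipaddress.IPv4Address(dst)),
    )
    # CRC32 is an illustrative choice of hash (assumption for this sketch).
    return zlib.crc32(packed) & 0xFFFFFFFF


def pick_next_hop(src: str, dst: str, n_next_hops: int) -> int:
    """Map the 32-bit key onto one of n equal-sized regions of the key space."""
    if n_next_hops < 1:
        raise ValueError("need at least one next hop")
    # Scaling by n and dropping the low 32 bits yields an index in 0..n-1.
    return (flow_key(src, dst) * n_next_hops) >> 32
```

Every packet with the same source and destination lands on the same next
hop, so a microflow is not reordered, while varying the low-order bits of a
127/8 destination is one way to steer probes onto different next hops.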

> > > 	     My point is that the usefulness of this address
> > > range depends on an assumption that ECMP hash algorithms
> > > will include some part of the least significant bytes of
> > > the destination IP address.  This is a grave restriction
> > > on the use of ECMP algorithms and should not be encouraged.
> > > If you cannot make this assumption, then the existence of
> > > this address range would not be useful and would in fact
> > > be misleading.
> > 
> > This is a safe assumption, but if not, then rather than a 127/8
> > address, the midpoint must return some indication that no address in
> > the range will allow the diagnostic.  The indication could be 0.0.0.0.
> 
> 	This might make sense.

OK.

> > If there is a router that does ECMP and uses an IP hash that doesn't
> > include the lower 24 bits, I'd be really surprised.  I don't think
> > such a thing exists or that there is any good reason to build such a
> > thing and therefore your objection has no merit.
> 
> 	Then you should be prepared to be surprised.  In
> general, it is sufficient to group flows using either a
> source or a destination IP address based hash.  Further, 
> it is of some importance to use a range of algorithms, or
> (non-exclusive) a variation of key sources/patterns so 
> that it is possible for the same source and destination
> to map to a different hash key at different places in the
> network.  Finally, it is very important to keep the hash 
> algorithms simple.

That is the purpose of the random key to the hash, which is updated
periodically (at a very long interval).
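
A rough sketch of that idea, under stated assumptions: each router mixes
its own random key into the hash, so the same source and destination can
map to different next hops at different routers, and the key can be
replaced occasionally. The class name, the 16-byte key, and the use of
keyed BLAKE2b are all hypothetical choices made for illustration; forwarding
hardware would use something far cheaper.

```python
import hashlib
import ipaddress
import os
import struct


class KeyedEcmpHasher:
    """Per-router ECMP selection with a random key (illustrative only)."""

    def __init__(self, key=None):
        # A random key per router unless one is supplied explicitly.
        self.key = key if key is not None else os.urandom(16)

    def rekey(self):
        """Replace the key; intended to happen rarely (very long interval)."""
        self.key = os.urandom(16)

    def next_hop(self, src: str, dst: str, n_next_hops: int) -> int:
        packed = struct.pack(
            "!II",
            int(ipaddress.IPv4Address(src)),
            int(ipaddress.IPv4Address(dst)),
        )
        digest = hashlib.blake2b(packed, digest_size=4, key=self.key).digest()
        # Scale the 32-bit keyed hash onto 0..n-1.
        return (int.from_bytes(digest, "big") * n_next_hops) >> 32
```

With the key held fixed, a given flow always takes the same next hop at a
given router. Two routers holding different keys will generally split the
same set of flows differently.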

> 	Consequently, it is possible that you may discover
> that some schemes use only a small-ish set of bits from
> either the source or destination IP address, depending 
> on the location of the device, and the direction of packet 
> flow, relative to the network core.  Even if the entire
> 64 bits of source and destination IP address is injected 
> into some algorithms, it is possible that steps in the
> algorithms themselves may ignore some of those bits.

A good hash does not ignore any of the bits.
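
To make the disagreement concrete, here is a small check, offered only as
a sketch, that flips each of the 64 input bits in turn and reports which
flips leave the hash unchanged. Both hash functions are hypothetical
stand-ins: full_hash runs CRC32 over all 64 bits, and truncated_hash models
the kind of scheme described in the quoted text by looking only at the top
8 bits of the destination address. The input is assumed to hold the source
address in the upper 32 bits and the destination in the lower 32.

```python
import zlib


def full_hash(bits64: int) -> int:
    """CRC32 over all 64 bits of source and destination (illustrative)."""
    return zlib.crc32(bits64.to_bytes(8, "big"))


def truncated_hash(bits64: int) -> int:
    """Hypothetical weak hash: uses only the top 8 bits of the destination."""
    dst = bits64 & 0xFFFFFFFF
    return dst >> 24


def ignored_bits(hash_fn, sample: int) -> list:
    """Bit positions (0 = least significant) whose flip leaves the hash unchanged."""
    return [b for b in range(64) if hash_fn(sample ^ (1 << b)) == hash_fn(sample)]
```

For a hash of the second kind, every address in 127/8 produces the same
value, so no choice of destination within that range can select a
different path. A single sample is enough to expose that here; a
non-linear hash would need to be checked against many samples.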

> 	If you believe that my objection still has no merit,
> then we are unlikely to find common ground for further
> discussion - at least with respect to this issue.

Can you please name such a router?  I know it is not Cisco, Juniper,
or Avici.  Nor the old Bay routers, or Ironbridge, or Pluris or the
former Argon.  Some of the newer edge boxes may have done this.  If
so, can you please name one?

> > The problem of multiple ECMP branch points remains a problem which
> > nothing proposed so far addressed adequately (unless you count
> > spraying over the whole 127/8 range as addressing this).
> 
> 	Address spraying would not adequately address this 
> either.

Such a router would be quite broken, not because MPLS-ping wouldn't
work but because the load split wouldn't be very good.  If making a
purchasing decision, I'd hand the maker a copy of RFC 2992 and wait for
them to fix the routers or go elsewhere.

> > > 	However, if you proposed a more straight-forward
> > > approach - requiring an ECMP origination point to (some
> > > day) support the ability to recognize a Ping packet and
> > > (presumably on its own initiative) produce its own Ping
> > > packets for all equal cost paths - that approach does
> > > seem useful and we should think about it some more.
> > 
> > I thought I just did propose something (not in the attached message
> > but in a prior message) that unlike what you state above requires no
> > change to forwarding.  It requires only a change in MPLS TTL expired
> > processing if the payload is MPLS-ping (which is already required).
> > It is really spread over three messages.  Do I need to resend it in
> > one message?
> 
> 	Either that, or point out which three messages.


I'll just resend them as one.

Curtis


> Eric W. Gray
> Systems Architect
> Celox Networks, Inc.
> egray@celoxnetworks.com
> 508 305 7214