The MPLS WG Archive


Last call on LSP Ping - ECMP thread (continued)

  • From: "Gray, Eric" <egray@celoxnetworks.com>
  • Date: Mon, 9 Dec 2002 17:32:36 -0500
  • Cc: "'erosen@cisco.com'" <erosen@cisco.com>, mpls@UU.NET

Curtis,

> Actually src/dst addresses are used and the hash is over those 64
> bits.  Some implementations use (or used) the port numbers as well.
> The idea was to split up the traffic but make sure no IP microflow was
> reordered.

	This is an unjustified assertion about implementations
in general.  You are not in a position to make assertions about
how all implementations do this.

> The normal HASH is over a 64 bit quantity.  ...

	Again, this is an assertion, though how anyone might 
determine what the "normal" hash algorithms might include
is beyond me.

> > 	     My point is that the usefulness of this address
> > range depends on an assumption that ECMP hash algorithms
> > will include some part of the least significant bytes of
> > the destination IP address.  This is a grave restriction
> > on the use of ECMP algorithms and should not be encouraged.
> > If you cannot make this assumption, then the existence of
> > this address range would not be useful and would in fact
> > be misleading.
> 
> This is a safe assumption, but if not, then rather than a 127/8
> address, the midpoint must return some indication that no address in
> the range will allow the diagnostic.  The indication could be 0.0.0.0.

	This might make sense.

> If there is a router that does ECMP and uses an IP hash that doesn't
> include the lower 24 bits, I'd be really surprised.  I don't think
> such a thing exists or that there is any good reason to build such a
> thing and therefore your objection has no merit.

	Then you should be prepared to be surprised.  In
general, it is sufficient to group flows using either a
source or a destination IP address based hash.  Further, 
it is of some importance to use a range of algorithms, or
(non-exclusive) a variation of key sources/patterns so 
that it is possible for the same source and destination
to map to a different hash key at different places in the 
network.  Finally, it is very important to keep the hash 
algorithms simple.

	Consequently, it is possible that you may discover
that some schemes use only a small-ish set of bits from
either the source or destination IP address, depending 
on the location of the device, and the direction of packet 
flow, relative to the network core.  Even if the entire
64 bits of source and destination IP address is injected 
into some algorithms, it is possible that steps in the
algorithms themselves may ignore some of those bits.
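
	To make the point concrete, here is a minimal sketch in Python.
It is not taken from any real router; the function names, the CRC-32
hash, the bit masks and the per-node seed are all assumptions chosen
for illustration.  It counts how many equal-cost paths a probe can
reach by varying the destination across 127.0.0.1 - 127.0.0.254 under
different choices of hash input.

```python
import ipaddress
import zlib


def ecmp_index(src, dst, n_paths,
               src_mask=0xFFFFFFFF, dst_mask=0xFFFFFFFF, seed=0):
    """Pick one of n_paths from the masked source/destination addresses.

    Hypothetical scheme: the masks select which address bits feed the
    hash, and the seed stands in for per-node variation of the key.
    """
    s = int(ipaddress.IPv4Address(src)) & src_mask
    d = int(ipaddress.IPv4Address(dst)) & dst_mask
    key = (seed.to_bytes(4, "big")
           + s.to_bytes(4, "big")
           + d.to_bytes(4, "big"))
    return zlib.crc32(key) % n_paths


def paths_reached(src, n_paths, **hash_opts):
    """Set of path indices hit when the destination sweeps 127.0.0.x."""
    dsts = (f"127.0.0.{i}" for i in range(1, 255))
    return {ecmp_index(src, d, n_paths, **hash_opts) for d in dsts}


if __name__ == "__main__":
    src = "192.0.2.1"  # documentation address, assumed for the example
    # Hash over all 64 bits: varying the destination moves the probe.
    print(sorted(paths_reached(src, 4)))
    # Hash ignores the low 24 destination bits: every 127/8 probe
    # lands on the same path.
    print(sorted(paths_reached(src, 4, dst_mask=0xFF000000)))
    # Source-only hash: the destination address has no effect at all.
    print(sorted(paths_reached(src, 4, dst_mask=0x00000000)))
```

	In the last two cases the sweep reaches exactly one path, since
none of the bits being varied enter the hash key.  Under those
assumptions no address in the range can steer the probe.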

	If you believe that my objection still has no merit,
then we are unlikely to find common ground for further
discussion - at least with respect to this issue.

> The problem of multiple ECMP branch points remains a problem which
> nothing proposed so far addressed adequately (unless you count
> spraying over the whole 127/8 range as addressing this).

	Address spraying would not adequately address this 
either.

> 
> > 	However, if you proposed a more straight-forward
> > approach - requiring an ECMP origination point to (some
> > day) support the ability to recognize a Ping packet and
> > (presumably on its own initiative) produce its own Ping
> > packets for all equal cost paths - that approach does
> > seem useful and we should think about it some more.
> 
> I thought I just did propose something (not in the attached message
> but in a prior message) that unlike what you state above requires no
> change to forwarding.  It requires only a change in MPLS TTL expired
> processing if the payload is MPLS-ping (which is already required).
> It is really spread over three messages.  Do I need to resend it in
> one message?

	Either that, or point out which three messages.

Eric W. Gray
Systems Architect
Celox Networks, Inc.
egray@celoxnetworks.com
508 305 7214


> 
> > Eric W. Gray
> > Systems Architect
> > Celox Networks, Inc.
> > egray@celoxnetworks.com
> > 508 305 7214
> 
> Curtis
> 
> 
> > > -----Original Message-----
> > > From: Curtis Villamizar [mailto:curtis@fictitious.org]
> > > Sent: Monday, December 09, 2002 2:13 PM
> > > To: Gray, Eric
> > > Cc: 'curtis@fictitious.org'; 'erosen@cisco.com'; mpls@UU.NET
> > > Subject: Re: Last call on LSP Ping
> > >
> > >
> > > In message <1117F7D44159934FB116E36F4ABF221B0267ED5B@celox-ma1-ems1.celoxnetworks.com>,
> > > "Gray, Eric" writes:
> > > > Curtis,
> > > >
> > > > 	I am suggesting that attempting to use any address range
> > > > as a mechanism for testing ECMP routes is a mistake.  I am also
> > > > suggesting that a single address is sufficient to satisfy the
> > > > need Eric Rosen points out.  Finally, if you were suggesting
> > > > that 127.0.0.1 should be specified (as opposed to 127.0.0.0/8)
> > > > for the same purpose, then I agree with you.
> > > >
> > > > 	Any address based approach to trying to test ECMP load
> > > > sharing paths implies prior knowledge of all ECMP algorithms
> > > > in use and is therefore broken.
> > > >
> > > > Eric W. Gray
> > > > Systems Architect
> > > > Celox Networks, Inc.
> > > > egray@celoxnetworks.com
> > > > 508 305 7214
> > >
> > >
> > > If you have ideas to address the prior message I sent on ECMP and in
> > > particular the problem of multiple branch points, I'd like to hear
> > > your solution.
> > >
> > > One thing that came to mind was the midpoint enumerating the possible
> > > paths with indices and the ingress providing the index in subsequent
> > > messages.  To do this, the ingress may have to bundle the better part
> > > of an MPLS-ping message inside another so the top POP exposes an IP
> > > packet and ECMP index to use and the packet inside is injected into
> > > the stream.  If we go with this, then 127.0.0.1 would be fine.  This
> > > has the disadvantage of not actually exercising the forwarding, just
> > > the control plane.
> > >
> > > The 127.0.0.1 address has a small advantage in that if the egress
> > > comes out in some unexpected place, then it is delivered to that
> > > router which can respond more intelligently than sending an ICMP TTL
> > > expired.
> > >
> > > Curtis