The MPLS WG Archive


On ECMP

  • From: Curtis Villamizar <curtis@workhorse.fictitious.org>
  • Date: Tue, 18 Nov 2003 00:44:02 -0500
  • cc: Kireeti Kompella <kireeti@juniper.net>, mpls@UU.NET


In message <200311180412.XAA11121@hank.bcentralhost.com>, mark seery writes:
> 
> > I don't know anything whatsoever about cisco's hash, and don't have
> > the least interest (from a 'will routing break?' point-of-view).  I am
> > willing to bet that J and C (and A) have very different hash
> > functions.  It Just Doesn't Matter (tm).
> 
> If this is true, then why has the use of a PID been suggested to bring in to
> play at least some commonality. Now it has been suggested to me offline that
> some of my comments are really only relevant in a highly multi-protocol
> world, and that world does not exist in the real world. If this is one of
> the causes of the disconnect, I understand where people are coming from.


Strictly speaking, a microflow is a stream of data which cannot be
reordered (without some sort of bad things happening).  For IP, a
microflow can be identified by source and destination address and,
for TCP and UDP, by the port numbers.

If a hash is performed on an IP source and destination address
(src/dst addr), and the path selected from a set of N choices using
modulo N (a simple and common technique), then each path will have a
set of microflows, generally a very large set.  Any one microflow will
take only one path and therefore packets for that microflow will not
be reordered.
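
As a rough illustration of the hash-and-modulo technique described
above, here is a minimal Python sketch.  The function name
select_path and the choice of CRC-32 as the hash are assumptions made
for the example; real routers use their own (different) hash
functions, and as noted earlier in the thread, the choice does not
matter for correctness.

```python
import ipaddress
import zlib


def select_path(src: str, dst: str, n_paths: int) -> int:
    """Pick one of n_paths from the IP src/dst pair (hypothetical helper)."""
    # Concatenate the packed addresses so the hash sees both fields.
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    # CRC-32 stands in for whatever hash a given implementation uses.
    return zlib.crc32(key) % n_paths


if __name__ == "__main__":
    # Every packet of one src/dst pair maps to the same path index,
    # so packets of that pair are never spread across paths.
    first = select_path("192.0.2.1", "198.51.100.7", 4)
    again = select_path("192.0.2.1", "198.51.100.7", 4)
    print(first == again)
```

Because the key is built only from fields that are constant for a
microflow, the result is constant for that microflow; different
src/dst pairs spread across the N paths.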

If you don't know that traffic is IP, then it is not generally safe to
try to split it up.  An LSP carrying the data stream for a single
leased line cannot be load split because reordering the packets would
have dire consequences.  When this type of LSP and some IP LSPs are
put inside another LSP (hierarchical LSPs) then there are two labels
in the label stack.  In this case, it is not possible for a midpoint
to tell by looking at the labels which packets are IP and which are
not.  Load split can be done on the inner label alone (the outer label
is constant) but the quality of the load split is generally
unsatisfactory.

It is desirable to be able to load split the IP traffic based on the
IP src/dst.  If the ISP knows that any non-IP traffic will never have
what looks like an IP header in the first four bytes, the ISP can
enable ECMP in which the first four bytes are examined to see if the
packet is IP.  If it is, the IP src/dst is used for the split; if
not, the labels are used.  If the vast majority of the traffic is IP,
then this yields a high quality load split.
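
The decision just described might be sketched as follows in Python.
This is only an illustration under stated assumptions: the names
ecmp_key and select_path_mpls are hypothetical, CRC-32 stands in for
a vendor hash, and the "looks like IP" test is reduced to checking
the version nibble (4 or 6) of the first byte after the bottom of the
label stack.  The label stack entry layout (20-bit label, 3-bit
TC/EXP, 1-bit S, 8-bit TTL) is the standard MPLS encoding.

```python
import struct
import zlib


def ecmp_key(packet: bytes) -> bytes:
    """Return the bytes to hash for an MPLS packet (hypothetical helper)."""
    labels = []
    offset = 0
    while True:
        if offset + 4 > len(packet):
            raise ValueError("truncated label stack")
        (entry,) = struct.unpack_from("!I", packet, offset)
        offset += 4
        labels.append(entry >> 12)      # top 20 bits are the label
        if entry & 0x100:               # S bit set: bottom of stack
            break
    payload = packet[offset:]
    version = payload[0] >> 4 if payload else 0
    if version == 4 and len(payload) >= 20:
        return payload[12:20]           # IPv4 src + dst addresses
    if version == 6 and len(payload) >= 40:
        return payload[8:40]            # IPv6 src + dst addresses
    # Not recognizably IP: fall back to splitting on the labels.
    return b"".join(struct.pack("!I", label) for label in labels)


def select_path_mpls(packet: bytes, n_paths: int) -> int:
    """Hash the chosen key and reduce it modulo the number of paths."""
    return zlib.crc32(ecmp_key(packet)) % n_paths
```

The check is a heuristic: it is only safe if non-IP payloads never
begin with something that passes for an IP header, which is exactly
the condition stated above.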

The PW PID just provides a means to ensure that for PW the above
condition is true (it ensures that no non-IP traffic will have
first four bytes that look like an IP header).  That is the only link
between PW and this discussion.

I made some comment in private email to Mark about multiprotocol.  A
lot of ISPs/carriers carry in their MPLS traffic IP, usually VOIP
(also IP), maybe IP VPN (also IP), and maybe FR, ATM, and TDM.  ECMP
is guaranteed to not scramble microflows if the FR, ATM, and TDM go
inside PW with PW PID (or otherwise acquire a 4 byte shim that doesn't
look like an IP header in the first four bytes, or in the first byte,
whose high-order four bits always carry the version number, 4 or 6).
For multiprotocol in the real world (of wide area networking) this is
about it.  OSI, X.25, ISDN, Novell, Appletalk, Netbios, SNA, DECNET,
etc, just plain don't exist in the core.  Yes, I did mention OSI in
that list, but if an ISP needs OSI for some dribble of control packets
for some arcane reason (or any of the others), then they can
encapsulate it in IP to keep its packet ordering intact in the
presence of ECMP.

Curtis


  • References:
    • On ECMP
      • From: mark seery <mark@interflect.com>