[mpls] Multicast in BGP/MPLS IP VPNs
Now that IP multicast for L3VPNs has been officially added to the charter, I
hope the WG will adopt draft-rosen-vpn-mcast as a WG draft. It's been
around for years, it has significant deployments, there are implementations
from more than one vendor, and they even interoperate in many cases.
We recently posted the latest version of this, draft-rosen-vpn-mcast-07.txt.
This differs from the -06 version in the following respects:
- The sections describing alternative schemes which were never implemented
and are not currently under consideration have been removed.
- When a PE receives a Join from a CE, the PE must determine which of the
other PEs is the PIM adjacency to which the Join must be forwarded. The
procedure in -06 did not always work when the two PEs were in different
Autonomous Systems. The procedure in the -07 version works in all cases.
- For multi-provider multicast domains, the procedures described in -06 did
not include the use of PIM-SSM ("Source-Specific Multicast") in the provider
network. We have come to believe, though, that PIM-SSM is very important in
the inter-provider case, as (a) it eliminates the need for providers to
coordinate their use of multicast addresses, and (b) it eliminates the
need for multiple providers to either coordinate on the use of a PIM
Rendezvous Point (RP) or to deploy MSDP. The procedures of -07 provide
full support for PIM-SSM. In -07, PIM-SSM is also made a "required to
implement".
- For a multi-provider multicast domain, -06 presupposed that all the PEs of
one provider's network would be routable in the IGP of the other
providers' networks. Version -07 has procedures which remove the need for
this presupposition.
- Version -06 hints at a method for increasing replication efficiency (at
the expense of additional state) by allowing more than one multicast
distribution tree for each VPN. Version -07 provides considerably more
detail on how this is done.
Version -07 is the first version in years to have substantive changes, so I
hope anyone interested in MVPN will look at it carefully, in preparation for
a discussion in San Diego.
Version -07 also references a couple of other relatively new drafts,
draft-nalawade-idr-mdt-safi-00.txt, and draft-wijnands-pim-proxy which
specify BGP and PIM extensions respectively that are used by some of the new
procedures for inter-AS and for BGP-free cores.
I would also like to take note of draft-raggarwa-l3vpn-2547-mvpn-00.txt.
This is mostly a cut-and-paste of draft-rosen-vpn-mcast-06.txt, which
purports to describe "the minimal set of procedures required to build
inter-operable implementations of multicast support for BGP/MPLS VPNs".
I notice that one of the cut-and-paste operations was to cut out the list of
authors and replace it with a new list! As a result, many of the sentences
seem to have been written by someone who isn't on the list of authors. Of
course, the new "authors" do not claim to be producing original work, but it
is a bit unusual to take someone else's work, strip it down, rearrange it a
bit, and then put one's own name on it.
However, the point that draft-rosen does not clearly specify the "minimal
set of procedures" for interoperability is well taken. The WG should
certainly decide which of the procedures specified therein are MUSTs and
which are optional. The two drafts make some different choices about this,
though the number of substantive differences is really quite small and they
do not appear to be fundamental. Therefore I would propose to have an
explicit discussion of the differences. The WG should resolve those
differences, and subsequent versions of draft-rosen (or
draft-ietf-l3vpn-whatever) should reflect the WG's decisions.
What draft-raggarwa really does is describe a particular implementation,
with the suggestion that the IETF standard should require only those
procedures which are deployed today as part of that implementation.
Everything else is "out of scope". That doesn't seem to me to be the right
approach; the WG should be free to produce a standard with required features
that go beyond today's "pre-standard" implementations.
Anyway, here is what I think the differences are:
- draft-raggarwa presents a procedure for multi-provider multicast domains
which does not work in conjunction with option b of section 10 of
rfc2547bis. Compatibility with option b is declared out of scope, which
seems somewhat arbitrary. I don't know what they think about
draft-rosen's solution for this, but certainly some such solution should be
REQUIRED in the standard. Draft-raggarwa also does not present a
procedure which works within a BGP-free core ("out of scope"), whereas
draft-rosen does; it certainly seems reasonable for the standard to
specify (and even require) such a procedure.
- draft-raggarwa has the following somewhat confusing paragraphs:
"Determination of the C-Join PIM neighbor address i.e. the RPF
neighbor address needs to be further explained. This depends on the
procedure used to assign an address to the MT interface. The address
of this interface MUST be the BGP next-hop address of the unicast VPN
routes advertised by the MD VRF. This will typically be a PE loopback
address in the provider address space.
To determine the C-Join neighbor address, the PE does a route lookup
on the C-Source address. This address is a VPN unicast route learnt
from the PE sitting in front of the multicast source. The route
lookup results in the BGP next-hop of the C-source VPN unicast route.
This BGP next-hop is the neighbor address to use while sending the
PIM-Join."
They are correct to assert that, in the absence of additional procedures
(such as, e.g., use of the "connector" attribute discussed in draft-rosen)
things won't work unless the MT interface address is the same address used
by the BGP session which is advertising the VPN-IPv4 routes from the VRF
with which the MT is associated. So (in the absence of additional
procedures) it should be a MUST for these "two" addresses to be the same.
(In some cases this can be automated, in others it might require
configuration.) But I don't know what they mean by saying that finding
the RPF address "depends on the procedure used to assign an address to the
MT interface". The problem is that determining the RPF address DOESN'T
depend on how the MT interface addresses were assigned, it depends only on
the BGP next-hop attributes. Thus if the MT interface addresses are not
assigned "correctly", the RPF address is not determined correctly, and
things just don't work.
Another problem is that this way of determining the RPF address is exactly
the thing that doesn't work in conjunction with option b of section 10 of
rfc2547bis.
Section 6.2 of draft-rosen has a more complete discussion of this, but
could probably do a better job of clarifying the relation between the MT
interface address and the address used by BGP.
The relevant procedures of draft-rosen make use of a new BGP address
family, MDT-SAFI, and a new BGP attribute, "connector", specified in the
draft-nalawade cited above. I see that the very same idea is proposed, in
a slightly different manner, in the newer
draft-raggarwa-bgp-nexthop-rewrite-00. In the latter draft it is called the
"original next hop" attribute instead of the "connector" attribute, but the
only real difference is that
the encoding proposed there does not encode an address family identifier,
while the encoding in draft-nalawade does. (Also, they don't actually say
that they intend to use this for multicast VPNs, but I presume that they
do.) By not encoding an address family, they implicitly rule out the use
of the MDT-SAFI address family, so I think the real underlying issue here
is whether the MDT-SAFI address family is really necessary or not. We
should discuss this explicitly rather than trying to hide it behind
encoding issues. The MDT-SAFI address family is proposed to be used for
PIM-SSM support, and to facilitate the creation of a multicast topology
which is "non-congruent" to the unicast topology. I think we'd need to
see alternate proposals for the support of this functionality before we
could just eliminate that address family.
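To make the RPF-neighbor issue above concrete, here is a rough Python sketch.
It is only an illustration under stated assumptions, not a description of any
implementation or of the actual encodings: the names VpnRoute, rpf_neighbor
and join_works are hypothetical, the addresses are documentation addresses,
and the "connector" is modeled simply as carrying the originating PE's
address. It shows that a lookup based only on the BGP next hop works when the
MT interface address equals that next hop, fails when the MT address is
assigned differently, and fails under option b (where an ASBR sets itself as
the next hop) unless some attribute preserves the originating PE's address.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VpnRoute:
    """A VPN unicast route toward the C-source, as seen by the receiving PE."""
    bgp_next_hop: str                # next hop as received; an ASBR may have rewritten it
    connector: Optional[str] = None  # assumption: originating PE's address, if carried


def rpf_neighbor(route: VpnRoute, use_connector: bool = False) -> str:
    """Return the address to which the C-Join would be sent."""
    if use_connector and route.connector is not None:
        return route.connector
    return route.bgp_next_hop


def join_works(route: VpnRoute, mt_address: str, use_connector: bool = False) -> bool:
    """The Join finds a PIM adjacency only if it targets the origin PE's MT address."""
    return rpf_neighbor(route, use_connector) == mt_address


PE1_LOOPBACK = "192.0.2.1"  # origin PE; ideally also its MT interface address
ASBR = "198.51.100.1"       # option b: the ASBR sets itself as next hop

same_as = VpnRoute(bgp_next_hop=PE1_LOOPBACK)
option_b = VpnRoute(bgp_next_hop=ASBR, connector=PE1_LOOPBACK)

print(join_works(same_as, mt_address=PE1_LOOPBACK))                       # True
print(join_works(same_as, mt_address="192.0.2.99"))                       # False
print(join_works(option_b, mt_address=PE1_LOOPBACK))                      # False
print(join_works(option_b, mt_address=PE1_LOOPBACK, use_connector=True))  # True
```

The second case is the misassigned MT address; the third is option b without
any additional procedure.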
- There is a difference between the two drafts with respect to which of the
PIM variants should be a "required to implement".
Draft-raggarwa requires PIM-SM (sparse mode), as it is the variant
deployed by the implementation which this draft describes, and a
scalability advantage over other PIM variants is claimed.
Draft-rosen requires PIM-SSM (source-specific multicast), as we believe this
to be the most feasible way of supporting multi-provider VPNs at scale.
The advantages of PIM-SSM for multi-provider VPNs are two:
* PIM-SSM is the only PIM variant which does not require that the
SP assign a unique multicast group address to each VPN; rather, only the
ordered pair <PE address, multicast group address> has to be unique. With
any other PIM variant, the SPs have to coordinate on the multicast
address assignment, and the assigned addresses have to be distinct from
any other multicast addresses which may occur in any of the SP networks
through which the VPN traffic passes. In rfc2547bis we go to great
lengths to ensure that it is easy to assign addresses uniquely without
coordination, and only PIM-SSM allows this characteristic to be carried
over to multicast. (Multicast address assignment has long been one of
the thorniest problems for the deployment of multicast generally.)
* PIM-SSM eliminates the need to deploy either multi-provider "Rendezvous
Points" (RPs) or MSDP. The main purpose of the RP in PIM-SM or of MSDP
is to enable the receivers to discover the transmitters. However, the
auto-discovery function for the sort of VPNs we are considering is
generally done via BGP, and there's really no need to require other
protocols to be used solely for the purpose of multicast auto-discovery.
Draft-raggarwa does not address the first of these two points, but does
address the second:
"Even if the ASs are under control of multiple service
providers, the level of cooperation required to offer even plain
unicast 2547 VPN service is high enough, which means that one more
issue (ownership of RP) may not be a significant addition to what is
already required. And if that is the case, the providers can share
RPs, and MSDP is not required. If each provider insists on having its
own local RP, MSDP can be used between the RPs that belong to the
different providers."
Frankly, this defense seems rather half-hearted to me.
Draft-raggarwa does have a discussion (section 5.1) about "switching from
shared to source specific trees in the SP network" where it is suggested
that scalability can be improved by using the shared tree rooted at the RP
rather than by using source trees. While the use of shared trees does
help to scale the amount of multicast routing state in the network, a
better way of using a shared tree would be to use PIM-BIDIR, which
eliminates the need to first send the multicast data packets to the RP.
Be that as it may, the discussion in section 5.1 is noted as being "a
local implementation choice" that "does not impact interoperability".
Since draft-raggarwa purports to describe only "the minimal set of
procedures required to build inter-operable implementations", it is not
clear why section 5.1 is even there.
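The address-assignment point made above for PIM-SSM can be illustrated with a
small Python sketch. This is a toy model under stated assumptions, not part of
either draft: the function name collisions, the VPN names and all addresses
are hypothetical and purely illustrative. Two providers independently pick the
same group address for different VPNs. If trees are identified by group
address alone, the two VPNs collide; if they are identified by the ordered
pair <PE address, multicast group address>, they do not.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (VPN name, address of the PE rooting the tree, multicast group address)
Assignment = Tuple[str, str, str]


def collisions(assignments: List[Assignment], ssm: bool) -> Dict[object, Set[str]]:
    """Return the tree identifiers that are claimed by more than one VPN."""
    users = defaultdict(set)
    for vpn, pe, group in assignments:
        # With SSM the tree is identified by (PE, group); otherwise by group alone.
        key = (pe, group) if ssm else group
        users[key].add(vpn)
    return {key: vpns for key, vpns in users.items() if len(vpns) > 1}


# Two providers chose the same group without coordinating.
assignments = [
    ("vpn-red", "192.0.2.1", "239.1.1.1"),
    ("vpn-blue", "198.51.100.7", "239.1.1.1"),
]

print(sorted(collisions(assignments, ssm=False)))  # ['239.1.1.1']
print(sorted(collisions(assignments, ssm=True)))   # []
```

Since the PE addresses already have to be unique, the uncoordinated choice of
group address is harmless in the second case.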
There are some other new things in draft-rosen which don't conflict directly
with anything in draft-raggarwa, such as the support for "Data MDTs", so I
won't discuss them in this note. However, ultimately we will need to decide
whether these are mandatory or optional procedures (assuming that the WG
agrees to include them in the standard).
My hope is that the technical issues raised above will be discussed on the
list and/or in San Diego, with the goal of producing a single WG document,
specifying the complete set of procedures as well as the set of required
procedures, shortly thereafter.
_______________________________________________
mpls mailing list
mpls@lists.ietf.org
https://www1.ietf.org/mailman/listinfo/mpls