Segment routing basics on IOS XRv

July 9, 2014

Well, the day has finally come: segment routing is in the land of the hardware-have-nots with the release of demo image 5.2.0 for IOS XRv. I was about to hit the hay last night when I saw the download location appear on Twitter courtesy of @ciscoiosxr, and my tiredness quickly left me. I have been waiting for something on SR for what seems like an eternity, so imagine my sadness when the files started downloading at 200K; Santa was going to make me wait. Finally the vmdk downloaded and I was ready to go.

In this post I will give a brief outline of how to configure SR in ISIS and verify some basic reachability. It’s pretty straightforward to get a base network up and running, but unfortunately there is not a lot of documentation on it. Have a look here for some info on CRS 5.2.x, which mentions it. Also, for reference, have a read of the draft here.

So, what does the topology look like?

base sr top

Each router is running ISIS (the process is called SR), and there is no explicit MPLS configuration required in the traditional sense (mpls ldp, rsvp etc.). XR1 and XR5 are PEs and have a VPNv4 session between each other. The customer VRF has one loopback defined on each PE, which is redistributed into the VRF to prove the concept.

The only place we need to make changes from typical routing configuration is under the ISIS process:

RP/0/0/CPU0:XR3#sho run router isis SR
Wed Jul 9 13:12:59.112 UTC
router isis SR
 is-type level-2-only
 net 49.0001.0000.0000.0003.00
 address-family ipv4 unicast
  metric-style wide
  segment-routing mpls
 !
 interface Loopback0
  address-family ipv4 unicast
   prefix-sid index 20003
  !
===snipiddy snip, nothing special here===

The only changes we had to make were to add segment-routing mpls under the IPv4 unicast AF and to set a prefix segment ID (SID) index on the loopback address. From my reading of the draft the loopback should be a Node SID, but that command doesn't seem to be available in this release. The index you configure is not used as the label directly: the OS adds it to the base of the label range in use, which is platform dependent. Here we support over 1 million labels, so the configured index is added to a base of 900000, and the highest index accepted is 65535 (more on that later). I chose 2000x where x is XRx from the node name, so index 20003 on XR3 becomes label 920003. As an aside, you cannot configure the prefix-sid index on a physical interface:

RP/0/0/CPU0:XR1(config)#router isis SR
RP/0/0/CPU0:XR1(config-isis)#int g0/0/0/0
RP/0/0/CPU0:XR1(config-isis-if)#add ipv4
RP/0/0/CPU0:XR1(config-isis-if-af)#prefix index 30001
RP/0/0/CPU0:XR1(config-isis-if-af)#commit
Wed Jul 9 14:31:37.859 UTC

% Failed to commit one or more configuration items during a pseudo-atomic 
operation. All changes made have been reverted. Please issue 'show 
configuration failed [inheritance]' from this session to view the errors

Now let's have a look at the network and see what we can see. Again documentation isn't readily available so bear with me here...

We can see XR1 is sending us one VPNv4 prefix, which is 10.0.0.1/32, the loopback10 address from VRF CUST over yonder the network:

RP/0/0/CPU0:XR5#sho bgp vpnv4 uni summ | i 1.1.1.1
Wed Jul 9 13:29:31.354 UTC
1.1.1.1 0 65000 62 62 7 0 0 00:58:53 1
RP/0/0/CPU0:XR5#sho bgp vpnv4 uni vrf CUST 10.0.0.1/32 | b 1.1.1.1
Wed Jul 9 13:31:09.727 UTC
1.1.1.1 (metric 40) from 1.1.1.1 (1.1.1.1)
Received Label 16000
Origin incomplete, metric 0, localpref 100, valid, internal, best, group-best, 
import-candidate, imported
Received Path ID 0, Local Path ID 1, version 6
Extended community: RT:1:1
Source VRF: CUST, Source Route Distinguisher: 1:1
RP/0/0/CPU0:XR5#

And just to be sure, this is what XR1 is sending:

RP/0/0/CPU0:XR1#sho bgp vpnv4 uni labels | i 1/32
Wed Jul 9 13:36:07.657 UTC
10.0.0.1/32 0.0.0.0 nolabel 16000

We are receiving a label of 16000 from XR1, standard XR fare there. How about the MPLS forwarding table? Let's look from XR5 to XR1 via XR3 and XR2 (the path via XR4 is costed out):

RP/0/0/CPU0:XR5#sho mpls for labels 920001
Wed Jul 9 13:38:23.908 UTC
Local Outgoing Prefix Outgoing Next Hop Bytes
Label Label or ID Interface Switched
------ ----------- ------------------ ------------ --------------- ------------
920001 920001 No ID Gi0/0/0/0 10.3.5.3 4585 

RP/0/0/CPU0:XR3#sho mpls for labels 920001
Wed Jul 9 13:37:47.240 UTC
Local Outgoing Prefix Outgoing Next Hop Bytes
Label Label or ID Interface Switched
------ ----------- ------------------ ------------ --------------- ------------
920001 920001 No ID Gi0/0/0/0 10.2.3.2 5269

RP/0/0/CPU0:XR2#sho mpls for labels 920001
Wed Jul 9 13:38:47.046 UTC
Local Outgoing Prefix Outgoing Next Hop Bytes
Label Label or ID Interface Switched
------ ----------- ------------------ ------------ --------------- ------------
920001 Pop No ID Gi0/0/0/0 10.1.2.1 5868
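
The three forwarding entries above can be walked in a few lines of Python to see what happens to the label stack hop by hop. This is only a toy model of the tables shown in this post: the dictionary layout and the function name are made up for illustration, and for simplicity it treats the label imposition on XR5 as the same lookup as a transit swap.

```python
POP = "Pop"

# node -> {local label: (outgoing label, next-hop node)}
# Values copied from the "show mpls forwarding" output above.
LFIB = {
    "XR5": {920001: (920001, "XR3")},
    "XR3": {920001: (920001, "XR2")},
    "XR2": {920001: (POP, "XR1")},
}


def walk(start, stack):
    """Follow the top label hop by hop.

    Returns the list of (node, next hop, labels sent) and the stack that
    arrives at the final node.
    """
    node, stack, hops = start, list(stack), []
    while stack and node in LFIB and stack[0] in LFIB[node]:
        out_label, next_hop = LFIB[node][stack[0]]
        if out_label == POP:
            stack = stack[1:]                 # penultimate hop removes the transport label
        else:
            stack = [out_label] + stack[1:]   # swap (here to the same value)
        hops.append((node, next_hop, tuple(stack)))
        node = next_hop
    return hops, stack


# Transport label 920001 on top of the VPN label 16000 learned from XR1.
hops, remaining = walk("XR5", [920001, 16000])
for node, next_hop, labels in hops:
    print(f"{node} -> {next_hop}: labels sent {labels}")
```

The point of the sketch is that the transport label never changes value between XR5 and XR2, because every node derives the same label for 1.1.1.1/32, and only the VPN label is left once XR2 pops it.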

And finally let's look at the ISIS database on XR1:

RP/0/0/CPU0:XR1#sho isis database verbose XR1.00-00 | b Seg
Wed Jul 9 13:43:26.497 UTC
Segment Routing: I:1 V:0, SRGB Base: 900000 Range: 65535
Metric: 10 IS-Extended XR1.01
Metric: 10 IS-Extended XR1.03
Metric: 10 IP-Extended 1.1.1.1/32
Prefix-SID Index: 20001, R:0 N:0 P:0
Metric: 10 IP-Extended 10.1.2.0/24
Metric: 10 IP-Extended 10.1.4.0/24

Here we can see the prefix SID index is set to 20001 and the supported range is 65535. We're all about the link state protocol, so XR5 sees the same thing:

RP/0/0/CPU0:XR5#sho isis da ver XR1.00-00 | b Seg
Wed Jul 9 13:44:42.612 UTC
Segment Routing: I:1 V:0, SRGB Base: 900000 Range: 65535
Metric: 10 IS-Extended XR1.01
Metric: 10 IS-Extended XR1.03
Metric: 10 IP-Extended 1.1.1.1/32
Prefix-SID Index: 20001, R:0 N:0 P:0
Metric: 10 IP-Extended 10.1.2.0/24
Metric: 10 IP-Extended 10.1.4.0/24

Now let's look at the CEF tables for the default VRF and the CUST VRF and see what we have:

RP/0/0/CPU0:XR5#sho cef 1.1.1.1/32 | i label
Wed Jul 9 13:48:58.124 UTC
local label 920001 labels imposed {920001}

The BGP next hop will be using 920001 for the transport label and we know the VPN label issued by BGP over on XR1 will be 16000 so the label stack for VRF CUST on XR5 towards 10.0.0.1/32 on XR1 is...

RP/0/0/CPU0:XR5#sho ip cef vrf CUST 10.0.0.1/32 | i label
Wed Jul 9 13:51:29.064 UTC
next hop 10.3.5.3/32 Gi0/0/0/0 labels imposed {920001 16000}

Magic. Now does this work in the dataplane? We should see traffic MPLS'd via XR3 and XR2 and finally IPv4'd in the VRF at XR1:

RP/0/0/CPU0:XR5#trace vrf CUST 10.0.0.1 so 10.0.0.5
Wed Jul 9 13:52:21.800 UTC

Type escape sequence to abort.
Tracing the route to 10.0.0.1

1 10.3.5.3 [MPLS: Labels 920001/16000 Exp 0] 39 msec 49 msec 29 msec
2 10.2.3.2 [MPLS: Labels 920001/16000 Exp 0] 29 msec 29 msec 39 msec
3 10.1.2.1 39 msec * 39 msec

Beautiful! As we can see, the label stack stays consistent across the network, as every node knows that 920001 is the label to reach 1.1.1.1/32, our BGP next hop from XR5.
In the opposite direction we should have a transport label of 920005 and the VPN label, which coincidentally should be 16000 as well, as there is only a single service and no other signalling protocols in play:

RP/0/0/CPU0:XR1#trace vrf CUST 10.0.0.5 so 10.0.0.1
Wed Jul 9 13:53:59.443 UTC

Type escape sequence to abort.
Tracing the route to 10.0.0.5

1 10.1.2.2 [MPLS: Labels 920005/16000 Exp 0] 119 msec 39 msec 39 msec
2 10.2.3.3 [MPLS: Labels 920005/16000 Exp 0] 39 msec 29 msec 29 msec
3 10.3.5.5 39 msec * 29 msec

One of the benefits of SR is being able to statically route your traffic without maintaining the state associated with RSVP. I don't know if that feature is available in this release; I haven't found anything yet.

Pig iron time: can we configure a prefix SID index higher than 65535? No would be the answer:

RP/0/0/CPU0:XR3(config)#router isis SR
RP/0/0/CPU0:XR3(config-isis)#int lo0
RP/0/0/CPU0:XR3(config-isis-if)#add ipv4 uni
RP/0/0/CPU0:XR3(config-isis-if-af)#prefix index 65536
RP/0/0/CPU0:XR3(config-isis-if-af)#commit
Wed Jul 9 13:26:46.825 UTC

% Failed to commit one or more configuration items during a pseudo-atomic operation. All changes made have been reverted. Please issue 'show configuration failed [inheritance]' from this session to view the errors
RP/0/0/CPU0:XR3(config-isis-if-af)#prefix index 65535
RP/0/0/CPU0:XR3(config-isis-if-af)#commit
Wed Jul 9 13:26:51.565 UTC
RP/0/0/CPU0:XR3(config-isis-if-af)#do sho mpls for | i 655
Wed Jul 9 13:27:03.434 UTC
965535 Aggregate default: Per-VRF Aggr[V] \
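
The arithmetic behind the labels seen in this post is simple enough to sketch in Python. The base and the highest accepted index are taken from the outputs above (SRGB Base 900000, index 65536 rejected, 65535 accepted); the function name is made up, and other platforms or releases may well use a different base and range.

```python
SRGB_BASE = 900000  # "SRGB Base" from the ISIS database output in this post
MAX_INDEX = 65535   # highest index the commit accepted in this post


def sid_label(index, base=SRGB_BASE, max_index=MAX_INDEX):
    """Return the MPLS label for a configured prefix-sid index."""
    if not 0 <= index <= max_index:
        raise ValueError(f"index {index} is outside 0..{max_index}")
    return base + index


print(sid_label(20001))  # XR1 loopback
print(sid_label(20005))  # XR5 loopback
print(sid_label(65535))  # the largest index that committed
```

With these values, index 20001 gives the 920001 seen in the forwarding tables and index 65535 gives the 965535 entry in the last output.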

I hope to write more on SR as more features become available, including interop between vendors, but for now enjoy the wonder that is the IGP-delivered label.


End of an era and new stuff

May 12, 2014

I have been with my current employer for almost 15 years; that’s pretty much my entire adult life. It turns out passing the CCIE opens a lot of doors, so I resigned and I am off to try out the world of contracting.

How does that impact on my blog? Well, as sporadic as my entries might be, I do enjoy writing and plan on keeping going. I’ve committed to passing the SRA within the next 6 months at most, so I will be posting more stuff relevant to that. I won’t have access to a lab 24/7 and will have to buy time from mysrlab, which seems very expensive, but what can you do. The sooner ALU release their vsim the better (around 12.0r4 apparently).

Anyway watch this space as I prepare for the NRS2 lab, complete the remaining 3 writtens for the SRA and then finally the SRA lab itself.

Categories: Uncategorized

RPF on the 7750

Burak recently asked for a post on RPF loose and strict modes and how they behave on the 7750.  I have quit my job so I have been frantically trying to get things finished and handed over and haven’t had time to really test anything for my own amusement.  As I will be finishing up tomorrow and won’t have access to any 7750 lab stuff this is a real quick thrown together post.

We will use a simple network of four routers. All routers have all interfaces in OSPF area 0 with the same cost of 10 on each link. OSPF preference (AD) on the 7750 is 10. We configure a static route on r1-rack3 pointing 1.2.3.4/32 out to r2-rack3 with a preference of 5, so it is preferred over OSPF. r1-rack3 is at 10.9.254.28.

 

Drawing2

So what does the route table look like from r1-rack3's perspective?

r1r3routetable

and what does r5-rack15 think?

r5r15 routetbale

r5-rack15 is going to send traffic on the direct path to r1-rack3 but r1-rack3 thinks 1.2.3.4 should be reachable via r2-rack3.  Let’s enable RPF on the interface and see what happens (ignore the IntraAS in the name, it’s from another test).

r1r3enable rpf

I have now enabled loose mode RPF. In theory traffic should pass here, because loose mode only requires that the source prefix is in the routing table. First clear the statistics (you need to use the urpf-stats variable to clear RPF stats or they won’t clear down).

clear stats

Now we send a ping from r5-rack15 sourced from 1.2.3.4/32.

ping from r5 good loose

As we can see, the pings are successful. Even though r1-rack3 would use a different egress interface to reach 1.2.3.4 than the one the packets arrived on, loose mode only checks that a route to the source exists, so the router accepts them.

r1r3 rpf loose no increment in poing

Happy days, our check fail stats have not increased.  Now let’s enable strict mode and see it all fall apart. Strict mode means you MUST receive the packet over the interface you would use to transmit to the destination.

r1r3 enable strict

Now when we ping from r5-rack15 to r1-rack3 we should not see a response to our packets arrive.

r5 pings fail

In fact debug router ip icmp doesn’t even show the failed attempts. They’re just ignored.

r1 rpf incrementing

Look at that, beautiful. OK, so it’s not a very elegant way of showing how it works, but it does. I haven’t found a debug for RPF fails or anything beyond show router interface statistics to display any further RPF information. If you know of any, stick it in the comments and I’ll add it.
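
The loose and strict checks described in this post can be sketched in Python as follows. This is a simplified model, not how the 7750 implements it: the interface names and function names are made up, the route table holds only the static route from the test, and things like ECMP or default routes are ignored.

```python
import ipaddress

# (prefix, egress interface) as seen by r1-rack3. Interface names are hypothetical.
ROUTES = [
    (ipaddress.ip_network("1.2.3.4/32"), "to-r2-rack3"),  # the static route
]


def best_route(src, routes):
    """Longest-prefix match for the packet's source address."""
    addr = ipaddress.ip_address(src)
    matches = [(net, ifname) for net, ifname in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)


def urpf_accepts(src, ingress_if, mode, routes=ROUTES):
    """Return True if a packet from src arriving on ingress_if passes the check."""
    route = best_route(src, routes)
    if route is None:
        return False              # no route back to the source: both modes drop
    if mode == "loose":
        return True               # any route to the source is good enough
    if mode == "strict":
        return route[1] == ingress_if  # must arrive on the interface we would reply on
    raise ValueError(f"unknown mode {mode!r}")


# Pings from r5-rack15 sourced from 1.2.3.4 arrive on the direct link.
print(urpf_accepts("1.2.3.4", "to-r5-rack15", "loose"))
print(urpf_accepts("1.2.3.4", "to-r5-rack15", "strict"))
```

In this model the first call passes and the second fails, which matches the behaviour seen in the test: loose mode let the pings through and strict mode dropped them.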

Categories: ALU IGPs, ALU Multicast

NG mVPN on the 7750 – let ‘er rip

March 18, 2014

In the last post in this series we saw how to configure our services on the 7750.  This post will show you what actually happens at various stages of service operation.  Let’s get started!

The I-PMSI: Once we no shut the VPRN we will cause the generation of the Intra-AS I-PMSI message to all our BGP peers. r1-rack3 is a route reflector so it will also send that to its clients. As you can see in the embedded text, the update carries the RD and the originator's IP address per the RFC 6514 message detail. As we enable the VPRN on each PE the equivalent message will be transmitted, and if we create more VPRNs, messages will be sent for them with different RDs.

mvpn type 1

LDP will also allocate labels to the service rooted at each PE.

ldp binding vprn no shut

And finally at this stage let’s look at the LIB.  We can see the label we advertised to pe2 is our ingress label with the tunnel ID matching too.

ldp bind table

Once every PE knows about all the others we have the default tree up and running, the overlay broadcast network. The customer should now be able to pass traffic. We still have our receiver configured on r4-ce1-rp, so now r5-ce2-src is going to transmit to the group, and we simulate this by pinging the group address from r5-ce2-src.

ping 224111

As we are only transmitting 100 bytes we should not trigger the S-PMSI creation at this point. Two things will happen now: each BGP speaker will receive a source join from pe1 connected to r4-ce1-rp (rx), followed by a source active route from pe2 connected to r5-ce2-src (tx).

The source join contains the C-S address (10.5.2.5) and the C-G address (224.1.1.1).

source join yellow

The source active update is sent from the PE connected to the stream source; in our case this is pe2. The mVPN-relevant difference from the source join is that the ASN is not present in the source active update.

source active

If we take a look at the BGP table we can see the I-PMSI, Source-Active and Source-Join entries.

bgp mvpnroutes pe1 and pe2

The S-PMSI: OK so now we have our tree built we can go ahead and ramp up the traffic so that we see the S-PMSI updates. Let’s generate some more ICMP but increase the packet size:

r5 tx spsi

This will cause our data threshold to be breached and trigger the sending of the S-PMSI from pe2 and traffic to switch over to the selective/data tree. Again we see the (C-S, C-G) state highlighted in yellow.

spmsi

Along with this LDP will allocate and advertise labels for the new tree. Here we send a message to our neighbour advertising label 262129, note the tunnel ID.

ldp spmsi bind

The BGP table now has the S-PMSI entry to go along with the other three, let’s have a look.

bgp routes incl spmsi

Once traffic throttles back below the threshold or stops completely the S-PMSI A-D will be withdrawn.  As well as the BGP update we can see the LDP withdraw messages exchanged between pe1 and pe2.

spmsi withdraw cos its no longer trafficking

So that’s pretty much all I have had time to test on the 7750. If you are interested in the topic, I wrote a more vendor-agnostic post over on packetpushers.net, which I will elaborate on further in a little mini series there too.

Please leave feedback in the comments sections or suggest something else you would like covered.

Categories: ALU Multicast

NG mVPN on the 7750 – make service rocket go now

March 14, 2014

In the last post we saw a description of how to configure your core to support mLDP based mVPN services and it was admittedly straightforward.  The meat of the work happens in the service configuration which we will look at here.  I will focus on the configs and save the actual operation of the service for another post as the debug is quite long.

All the mVPN service configuration happens in the, wait for it, mvpn hierarchy.   For me one of the most critical parts of the standard is the use of BGP auto-discovery to, well, discover other PE routers in the mVPN without relying on PIM in the core.  Let’s configure that first.  We have the choice of ‘default’ or ‘mdt-safi’ and we are going to choose ‘default’ as we are not interested in the MDT SAFI.  Auto discovery is agnostic to the transport mechanism.

There is a certain order of operations to follow which we will see.  Before we can configure C signalling we need to configure the A-D method.

addef

Again more order of operations, before we configure our tunnels we have to enable C multicast signalling.  Here we need to choose between BGP and PIM for signalling between the PE routers.  As we want to get rid of PIM from our core let’s go with BGP.

cmcast

OK, now the router is ready for our tunnel configuration. In a Draft Rosen VPN we would need to associate our mdt with a VPN. We don’t have to do that with NG mVPN, but we do need to enable transport for our trees.

In multicast VPNs we have the concept of the default and data trees, which are known as the I-PMSI and S-PMSI respectively. The I-PMSI serves the same function as the default tree, where GRE tunnels are created in Draft Rosen at service initiation. This allows the entire VPN to receive multicast traffic but is not efficient. Why is it inefficient? Any PE without interested receivers connected will still receive traffic on this tree but drop it, wasting resources in the network. The S-PMSI takes care of this by building a tree between interested PEs only and switching traffic over to it, typically when traffic exceeds a particular data rate threshold. To enable these we go in to provider-tunnel and no shut our transport within the I-PMSI or S-PMSI.
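
The inefficiency of the inclusive tree and the threshold-based switch to the selective tree can be sketched roughly in Python. This is a toy model and not SR OS behaviour: the PE names, the 1 kbps threshold and the function name are all made up for illustration.

```python
def select_tree(rate_kbps, threshold_kbps, interested_pes, all_pes):
    """Pick the tree for a (C-S, C-G) flow and list the PEs that receive it.

    Returns (tree name, PEs receiving traffic, PEs receiving traffic they drop).
    """
    if rate_kbps > threshold_kbps:
        # Selective tree: only PEs with interested receivers are leaves.
        return "S-PMSI", set(interested_pes), set()
    # Inclusive tree: every PE in the mVPN is a leaf, wanted or not.
    wasted = set(all_pes) - set(interested_pes)
    return "I-PMSI", set(all_pes), wasted


ALL_PES = {"pe-a", "pe-b", "pe-c", "pe-d"}  # hypothetical PEs in the mVPN
INTERESTED = {"pe-a"}                       # only one has a receiver

for rate in (0.5, 50):
    tree, receivers, wasted = select_tree(rate, 1, INTERESTED, ALL_PES)
    print(f"{rate} kbps -> {tree}, delivered to {sorted(receivers)}, dropped at {sorted(wasted)}")
```

Below the threshold the flow stays on the inclusive tree and three of the four PEs in this example receive traffic only to drop it; above the threshold only the interested PE is on the tree.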

We have three options for transport: mLDP, RSVP-TE and PIM

inclusive

As we are using mLDP for this service we will go ahead and enable that for the inclusive and selective tunnels. We can’t do this if we have not enabled the VRF PIM process, which we took care of in this post.

incsele

I think this highlights why I am such a fan of SROS. It is so simple to do some pretty complicated things, even though you must do things in a specific order. Time was you would have all sorts of crazy patches and elaborate nonsensical configs, but this OS is well structured for the most part. Let's not spoil the moment by thinking about QoS and triple play configs :).

If you were in the process of migrating from a PIM core to a pure MPLS one you could enable PIM as your provider tunnel here for nodes that don’t support mLDP.  Because this is done on a per service basis you could gradually migrate away from your legacy PIM based core.

Anyway, we are just one short command away from finishing our service and testing it out. If you want, you can configure mVPN-specific route targets (maybe you want to import routes at different remote PEs), but if you don’t need to do this you can inherit them from the unicast RT.

vrftaruni

That's it! The service should be up because of course we have configured the same thing on each of our participating PEs. So is it up? The tension is killing me…

post 2 mvpn up

And the service is up with mLDP based provider tunnels. The next post will cover what actually happens as we enable the service and traffic starts to flow.

Categories: ALU Multicast

NG mVPN implementation on the 7750 – Setting up for service

March 14, 2014

Ok so it has been a while. I have done some testing on LDP based mVPN at this stage, not a huge amount, mainly basic functionality.

The first thing to note is that the 7750 MUST run in chassis mode D to allow the multicast commands required for mLDP. This means no IOM1 or IOM2 cards are allowed in the chassis, and because of this I have a limited topology to play with.

PE network r5 src

Both PE routers are SR12s running CPM2-400G and have IMMs or IOM3 in the CE facing slots. The CE routers are Cisco 1841s. r4-ce1-rp is the RP and also the receiver; its config is below. r5-ce2-src simply runs PIM on its uplink to pe2 and will source traffic to group 224.1.1.1.

r4 config

Roll on up the 7750s. The first task is to create the VPRN service and place the CE facing interface in to the VRF's PIM process. I have configured the CE facing interface name as “mvpn” just for clarity; it has no bearing on the mVPN configuration. The remainder of the config is exactly the same as a standard VPRN. PIM configuration is very straightforward: all I do is enter the PIM process and add the interface. The rest of the config below is defaulted. I won’t be setting the RP address as I think that’s intrusive on the customer's experience; they should have the freedom to change their RP as they see fit.

r1pimconfig

Once we do this, and assuming connectivity is good, we should see a PIM adjacency going up:

r4pimup

So that is the very basic element complete.  Now we just need to do the entire core!  Well not so much.  Per the diagram above I already have MP-BGP configured and both routers have address family mvpn-ipv4 activated.  I also have VPNv4 activated, not much point having mVPN and no IPv4 VPN to use.  Would anyone buy that service?

r1bgpaf

There is one other ‘core’  element we need to verify before we get in to the VRF specific multicast configuration.  We need to ensure our network interface will support multicast and the creation of mLDP trees. We simply enable it under the LDP interface…

re1multitrenable

and if we want to prevent multicast over an interface…

r1multidis

I have links to routers that do not run in mode D so multicast processing should be disabled.

OK, so with MP-BGP and LDP multicast enabled we can configure the VPRN to carry the customer's traffic. That will come in the next post, which will cover the VPRN config, the debug behaviour when the service comes up and how BGP updates trigger switchover to the S-PMSI.

Categories: ALU Multicast

Multicast of the 7750 – IGMP static joins

January 25, 2014

Here we will configure IGMP and associate an interface with it. We will also create a static join which is useful for troubleshooting but I guess on a platform like the 7750 you would use it for broadcast TV.

Let’s create an interface first, I’ll call it igmp but the name is irrelevant to the IGMP process.

config router interface igmp
address 3.3.3.3/24
port 1/1/2:3

Now let's start IGMP and associate the interface with the process:

config router igmp interface "igmp"

Unshut the process if not already done.
So now that the interface is there we need to associate the static mapping with it. We will use group 239.3.3.3. Once we do this we will see a warning message:

*A:R3>config>router>igmp>if# static group 239.3.3.3
WARNING: CLI The static group is not yet created because source or starg is not yet specified.

What this means is that we need to either set a static source or configure it as ASM with the starg keyword, meaning (*,G).
So all we need to do is lash in the keyword and we should see the shared tree back to the RP.

starg

And we do, the link to R3 is in the OIL

*A:R4# show router pim group 239.3.3.3 detail
.
===============================================================================
PIM Source Group ipv4
===============================================================================
Group Address : 239.3.3.3
Source Address : *
RP Address : 44.44.44.44
Advt Router : 44.44.44.44
Flags : Type : (*,G)
MRIB Next Hop :
MRIB Src Flags : self Keepalive Timer : Not Running
Up Time : 0d 00:09:51 Resolved By : rtable-u
.
Up JP State : Joined Up JP Expiry : 0d 00:00:08
Up JP Rpt : Not Joined StarG Up JP Rpt Override : 0d 00:00:00
.
Rpf Neighbor :
Incoming Intf :
Outgoing Intf List : R3_1/1/2
.
Curr Fwding Rate : 0.0 kbps
Forwarded Packets : 0 Discarded Packets : 0
Forwarded Octets : 0 RPF Mismatches : 0
Spt threshold : 0 kbps ECMP opt threshold : 7
Admin bandwidth : 1 kbps
-------------------------------------------------------------------------------
Groups : 1

Short and sweet

Categories: ALU Multicast