LISP Real Life Implementation

21 Aug 2015


I was working on the internal IT side of a regional ISP. Up until a few years ago, the lay of the land was basically like-for-like hardware refreshes, little virtualization, and a point solution for everything. In other words, stuff that no modern IT shop should do. New management started changing things a bit, and IT was looking to make its infrastructure more robust in order to start hosting apps for the customer side of the business. Up until then, the OSS and ISP core groups were running their own infrastructure for everything: network (obviously), servers and storage. They relied on vendors to support all the stuff they didn't have expertise in (storage, for example).

So the point of the whole thing was to create a geographically diverse datacenter that permitted workload mobility and hopefully didn't create too many networking problems. Multi-tenancy was a must. The goal was also to merge the enterprise and internet-facing fabrics into a single fabric. Scale wasn't huge: about 35 vSphere hosts, 10 Oracle/Sun server virtualization hosts and about 150 random servers. But at the same time, we were expecting a lot of business to come in from the other side of the business.

So we bought four Nexus 7009s, with 2232s for top of rack. DCI was done with OTV. Multi-tenancy was taken care of by VDCs for large tenants and VRFs for smaller ones or sub-tenants. That said, we didn't really have a solution to attract traffic to the specific side of the datacenter where a given VM resided. That's where LISP comes in.

The main issue we had, simply from a hardware point of view, was that the only boxes we had that did LISP were the 7009s. We had 7604 routers as core routers and Cat 6500s as WAN aggregation/enterprise edge as well as distribution outside the datacenter(s), all equipped with Sup720s, so no LISP there. Not an ASR in sight, and no budget for one either.

So I had to design the whole thing around VRFs on the 7009s. I know it seems counter-intuitive to the way LISP works, but in this case it worked. Because of the segmentation requirements and the firewalls and load-balancers in the environment, I knew that encapsulating and tunneling traffic around wouldn't work anyway, so I went the LISP ESM Multihop Mobility route:

http://www.cisco.com/c/en/us/td/docs/switches/datacenter/sw/nx-os/lisp/configuration/guide/bNX-OSLISPConfigurationGuide/bNX-OSLISPConfigurationGuidechapter01000.html

Basically, what I had was a VRF hosting the MS/MR, another VRF for the entry into the datacenter that was an xTR, and another xTR in each of the tenant VRFs. With Extended Subnet Mode, I was detecting a VM moving around the infrastructure from its default gateway. Each IP tracked by the gateway created a LISP route (a /32) in the tenant VRF, and then I would redistribute the LISP routes.
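To make the /32 idea a bit more concrete, here is a toy Python model of a tenant routing table. It is only a sketch of the concept, not of how NX-OS implements LISP: the class `TenantVrf`, the method `host_detected`, the site labels and the 10.1.1.0/24 subnet are all made up for illustration. The stretched subnet is reachable via one site by default, and a detected host gets a more specific /32 that wins the longest-prefix match.

```python
import ipaddress


class TenantVrf:
    """Toy routing table for one tenant VRF (hypothetical model, not NX-OS)."""

    def __init__(self):
        # Maps an ip_network to a next-hop label (here, just a site name).
        self.routes = {}

    def add_route(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def host_detected(self, host_ip, site):
        """Stand-in for the gateway detecting an IP: install or move its /32."""
        self.add_route(f"{host_ip}/32", site)

    def lookup(self, dest_ip):
        """Return the next hop of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(dest_ip)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


vrf = TenantVrf()
vrf.add_route("10.1.1.0/24", "site-A")    # subnet stretched across both sites
vrf.host_detected("10.1.1.50", "site-A")  # VM first seen at site A
before_move = vrf.lookup("10.1.1.50")
vrf.host_detected("10.1.1.50", "site-B")  # VM moves, /32 follows it
after_move = vrf.lookup("10.1.1.50")
other_host = vrf.lookup("10.1.1.51")      # no /32, falls back to the /24
print(before_move, after_move, other_host)
```

The point of the sketch is just that nothing about the /24 changes when a VM moves; only the host route does, and that is the route that gets redistributed.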

Just a side note on the routing: I ran OSPF in the Global/Default VRF, sent a default route to all tenant VRFs, and used route leaking to pass the tenants' routes to the Global table.
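The routing arrangement in that side note can be sketched the same way. Again, this is a simplified model under my own assumptions: the table contents, the `vrf:tenant-a` label and the helper names `lookup` and `leak` are hypothetical, and real route leaking involves policy that is not shown here. Each tenant table carries a default pointing at the global table, and the tenant's own prefixes (including the /32s) are copied into the global table.

```python
import ipaddress


def lookup(table, dest_ip):
    """Longest-prefix match over a dict of prefix string -> next-hop label."""
    addr = ipaddress.ip_address(dest_ip)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


def leak(tenant_name, tenant_table, global_table):
    """Copy a tenant's routes into the global table, skipping its default."""
    for prefix in tenant_table:
        if ipaddress.ip_network(prefix).prefixlen == 0:
            continue  # the default came from global; do not leak it back
        global_table[prefix] = f"vrf:{tenant_name}"


# Hypothetical tables: documentation prefixes stand in for real networks.
global_table = {"192.0.2.0/24": "core"}
tenant_a = {
    "0.0.0.0/0": "global",     # default sent into the tenant VRF
    "10.1.1.0/24": "local",    # tenant subnet
    "10.1.1.50/32": "site-B",  # host route created by LISP detection
}

leak("tenant-a", tenant_a, global_table)

to_vm = lookup(global_table, "10.1.1.50")      # global now knows the /32
to_outside = lookup(tenant_a, "198.51.100.7")  # tenant falls back to default
print(to_vm, to_outside)
```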

So in the end, I kinda just used LISP for its ability to detect the movement of a VM and then inject /32s in the right spot. But whenever LISP is supported at the remote sites, or even in the core, I can tunnel the traffic and optimize the forwarding that way.

Published on 21 Aug 2015