It’s Time to Stop Using the Term Network Function Virtualization (NFV)

I think it’s time to stop using the term “network function virtualization”. Why? Because it doesn’t exist, at least not in the way the term suggests. The term is a category error, and when people try to make sense of the term, confusion and frustration ensue.

Think of it like this: what’s the difference between a “virtual network function” and a “non-virtual network function”? For example, how is “virtual IP forwarding” different than “non-virtual IP forwarding?” Answer: it’s not.

So what then exactly is network function virtualization?

The Right Idea, The Wrong Term

The European Telecommunications Standards Institute, which arguably coined the term NFV, said the following in a 2012 whitepaper (emphasis mine):

Network Functions Virtualisation aims to address these problems by leveraging standard IT virtualisation technology to consolidate many network equipment types onto industry standard high volume servers

Look at the bold text. How does one consolidate many network equipment types onto commodity servers? Let’s add some specifics to make it more concrete. How does one consolidate a firewall, router, switch, and load-balancer onto a server? By implementing those network functions in software and putting that software on the server.

But here’s the problem with calling that “network function virtualization”: virtualization has nothing to do with implementing network functions in software. In the early days of the Internet, routers (gateways as they were called back then) ran on commodity x86 machines with no virtualization (with the exception, maybe, of virtual memory).

Network functions don’t need virtualizing, and in fact, can’t be virtualized. But the term NFV suggests otherwise.

And that’s where the confusion started….

NFV is like dividing by zero: undefined

Conceptually, NFV is just implementing network functions in software. That’s easy enough to understand. And yet it’s hard to find an actual definition of it anywhere. Instead, you’ll see a lot of hand-wavy things like this:

NFV is a virtual networking concept…
NFV is a network architecture concept that uses the technologies of IT virtualization…

Hence the letters “N” and “V”. And then you have those who gave up on a definition and just went straight for the marketing lingo:

NFV is the next step…
…is the future…
…is the progression/evolution…

Others get closer by hinting at what NFV does, but stop short of actually saying what it is:

NFV consolidates multiple network functions onto industry standard equipment

This seems to be pretty close, but where’s the virtualization part come in? Let’s try this blurb from Angela Karl at TechGenix:

[NFV lets] service providers and operators… abstract network services, including things such as load balancing, into a software that can run on basic server.

Bingo. NFV is not virtualizaton at all. It’s an abstraction of network functions!

NFV is Abstraction, not Virtualization

Before you accuse me of splitting hairs, let me explain the distinction between virtualization and abstraction. Put simply, virtualization is an imitation, while abstraction is a disguise.

Virtualization is an imitation

When you virtualize something, you’re creating an imitation of the thing you’re virtualizing.

For example, when you create a virtual disk in your favorite hypervisor, you’re hiding the characteristics of the underlying storage (disk geometry, partition info, formatting, interface, etc.). But in the same motion, you give the virtual disk the same types of characteristics: disk geometry, partition info, formatting, interface, and so on. To put it in programming lingo, the properties are the same, but the values are different.

Virtualization preserves the underlying properties and doesn’t add any property that’s not already there. Have you ever pinged a virtual disk? Probably not, because virtual disks, like real disks, don’t have network stacks.

Virtualization also preserves the behavior of the thing being virtualized. That’s why you can “shut down” and “power off” virtual machines and “format” and “repartition” virtual disks.

Now try fitting NFV into this definition of virtualization. How do you “virtually route” or “virtually block” a packet? It’s a category error.

Abstraction is a disguise

When you create an abstraction, you’re creating a disguise. Unlike virtualization, with abstraction you’re changing some of the properties of the thing you’re abstracting. You’re taking something and dressing it up to look and act completely different.

Swap space is a good example of an abstraction. It’s data on storage that looks and acts like random access memory (but way slower). Before the days of SSDs, swap was stored on spinning disks which were read and written sequentially. This is completely different than memory which can be read and written randomly. Swap space is a file (Windows) or partition (Linux) disguised as RAM.

The Case for Abstracting Network Functions

Let’s bring this around to networking. What’s it mean to abstract network functions like IP routing and traffic filtering? More importantly, why would you want to? Why not just use virtual routers, switches, and firewalls?

Simply put, virtualized network devices don’t scale. The reasons for this are too numerous to list here, but suffice it to say that TCP/IP and Ethernet networks have a lot of built-in waste and aren’t the most efficient. This is why cloud providers do network function abstraction to an extreme. It’s utterly necessary. Let’s take Amazon AWS as an example.

In AWS, an instance has a virtual network interface. But what’s that virtual network interface connected to? A virtual switch? Nope. Virtual router? Try again. A virtual firewall. Negative. Virtual routers, switches, and firewalls don’t exist on the AWS platform. So the question remains: what’s that virtual NIC connected to?

The answer: nothing. The word “connected” here is a virtual concept borrowed from the real world. You “connect” NICs to switches. In your favorite hypervisor, you “connect” a vNIC to a vSwitch.

But there are no virtual switches or routers in this cloud. They’ve been abstracted into network functions. AWS presents this as if you’re connecting a virtual interface to a “subnet” rather than a router. That’s because AWS has abstracted IP routing away from you, leaving you with nothing to “connect” to. After all, we’re dealing with data. Not devices. Not even virtual devices. So what happens? The virtual NIC passes its traffic to some software that performs network functions. This software does a number of things:

  • Switching – It looks at the Ethernet frame and checks the destination MAC address. If the frame contains an ARP request seeking the default gateway, it replies.
  • Traffic Filtering – If it’s a unicast for the default gateway, it looks at the IP header and checks the destination against the security group rules, NACLs, and routing rules.
  • Routing – If it needs to forward the packet, it forwards it (although forwarding may simply consist of passing it off to another function.)

This is a massive oversimplification, of course, but you get the idea. There’s no reason to “virtualize” anything here because all you’re doing is manipulating bits!

Overvirtualizing the Network

It’s possible to over-virtualize. To give an analogy, suppose you wanted to write a calculator application (let’s call it a virtual calculator). You’d draw a little box with numbers and operators, and let the user click the buttons to perform a calculation. Now imagine that you also decided to write a “virtual hand” application that virtually pressed buttons on the virtual calculator. That would be ridiculous, but that’s essentially what happens when you connect two virtual network devices together.

There an especially great temptation to do this in the cloud. Folks may spin up virtual firewalls, cluster them together, connect them to virtual load-balancers, IDSes, and whatnot. That’s not bad or technically wrong, but in many cases it’s just unnecessary. All of those network functions can be performed in software, without the additional complexity of virtual NICs connecting to this and that.

The Difference Between a Virtual Network Device and a Network Function

When it comes to the cloud, it’s not always clear what you’re looking at. Here are some questions I ask to figure out whether a thing in the cloud is a virtual device or just a abstracted network function:

Is there an obvious real world analog?

There’s a continuum here. An instance has a clear real world analog: a virtual machine. An Internet gateway sounds awfully like the router your ISP puts at your site, but “connecting” to it is a bit hand-wavy. You don’t get a next-hop IP or interface. Instead, your next hop is igw- followed by some gibberish. That smacks of an abstraction to me.

Can you view the MAC address table or create bogus ARP entries?

If  you can, it’s a virtual device (maybe just a Linux VM). If not, it’s likely some voodoo done in software.

Can you blackhole routes?

In AWS you can create blackhole routes, although people usually do it by accident. You can create a route with an internet gateway as a next hop, then delete the gateway. But can you create a route pointing to null0? If not, you have an abstraction, not a virtual device.

Does the TTL get decremented at each hop?

A TTL in an overlay can get decremented based on the hops in the underlay. But what I’m talking about here is not decrementing the TTL when you normally would. AWS doesn’t decrement the TTL at each hop. If you were to get into a routing loop, you’d have a nasty problem. Hence, AWS doesn’t allow transitive routing through its VPCs. So if your TTLs don’t go down at each hop, as with AWS, you’re probably dealing with an abstraction.

 

Installing the VMware ESXi Embedded Host Client

As most everyone knows, the old VMware vSphere C# client has been on its way out for years. One of the things keeping it alive is the fact that not everyone has a vCenter Server, and even those who do don’t necessarily use the Web Client. Sadly, there are some really cool features the old Windows client can’t touch, such as exposing hardware-assisted virtualization to individual VMs.

If you have a home lab and don’t need vCenter, thee ESXi Embedded Host Client gives you web-based access to these hidden features of your standalone ESXi host without having to spin up a real vCenter server.

Here’s how to install it:

  1. Shut down all VMs and place the host in maintenance mode
  2. SSH into ESXi and execute the following
    [root@esxi:~] esxcli software vib install -v http://download3.vmware.com/software/vmw-tools/esxui/esxui-signed-4393350.vib
  3. Browse to https://[ESXi]/ui
    You should see the login screen:
    VMware ESXi Embedded Host Client Login Screen
  4. Log in using whatever credentials you use in the old C# vSphere client. You should see something that looks an awful lot like the vSphere Web Client:
    VMware ESXi Embedded Host Client Initial Screen

Using IRQbalance to Improve Network Throughput in XenServer

If you are running XenServer 5.6 FP1 or later, there is a little trick you can use to improve network throughput on the host.

By default, XenServer uses the netback process to process network traffic, and each host is limited to four instances of netback, with one instance running on each of dom0’s vCPUs. When a VM starts, each of its VIFs (Virtual InterFaces) is assigned to a netback instance in a round-robin fashion. While this results in a pretty even distribution of VIFs-to-netback processes, it is extremely inefficient during times of high network load because the CPU is not being fully utilized.

For example, suppose you have four VMs on a host, with each VM having one VIF each. VM1 is assigned to netback instance 0 which is tied to vCPU0, VM2 is assigned to netback instance 1 which is tied to vCPU1, and so on. Now suppose VM1 experiences a very high network load. Netback instance 1 is tasked with handling all of VM1’s traffic, and vCPU0 is the only vCPU doing work for netback instance 1. That means the other three vCPUs are sitting idle, while vCPU0 does all the work.

You can see this phenomenon for yourself by doing a cat /proc/interrupts from dom0’s console. You’ll see something similar to this:


(The screenshot doesn’t show it, but the first column of highlighted numbers is CPU0, the second is CPU1, and so on. The numbers represent the quantity of interrupt requests.)

If you’ve ever troubleshot obscure networking configurations in the physical world, you’ve probably run into a router or firewall whose CPU was being asked to do so much that it was causing a network slowdown. Fortunately in this case, we don’t have to make any major configuration changes or buy new hardware to fix the problem.

All we need to do to increase efficiency in this scenario is to evenly distribute the VIFs’ workloads across all available CPUs. We could manually do this at the bash prompt, or we could just download and install irqbalance.

irqbalance is a linux daemon that automatically distributes interrupts across all available CPUs and cores. To install it, issue the following command at the dom0 bash prompt:

yum install irqbalance --enablerepo base

You can either restart the host or manually start the service/daemon by issuing:

service irqbalance start

Now restart your VMs and do another cat /proc/interrupts. This time you should see something like this:

That’s much better! Try this out on your test XenServer host(s) first and see if you can tell a difference. Citrix has a whitepaper titled Achieving a fair distribution of the processing of guest network traffic over available physical CPUs (that’s a mouthful) that goes into more technical detail about netback and irqbalance.

How To Get a Unique STA ID for each of your PVS Provisioned XenApp Servers

Citrix Provisioning Services is very nice, but it does come with a slightly annoying quirk: All of your provisioned XenApp servers end up with the same STA ID! This will cause all sorts of problems for Citrix Access Gateway, Citrix Receiver, and anything else that may depend on having unique STA IDs. The good news is that fixing this little problem is easier than you might think.

To resolve the duplicate STA ID issue, we’ll do the following:
1. Create personality strings in Provisioning Services for each XenApp server
2. Put a PowerShell script on our golden image
3. Create a startup task to execute the PowerShell script

Let’s begin:

The format of the STA ID is simple. It is “STA” followed by the MAC address of the XenApp server’s NIC. The STA ID can really be anything beginning with “STA”, so you could get creative and have “STABLEFLY”, “STANK”, “STALE”, “STARTBUTTON”, “STACKOVERFLOW”.. and the list goes on. But I recommend sticking with the MAC address because it’s unique (usually), and is easy to match up.

1. In Provisioning Services, create a personality string for each server with the Name “UID” and the String “STA001122DDEEFF”, substituting the MAC address of the server for the hex I just threw in there.
Define Personality Strings

2. Copy this PowerShell script to your scripts location on your golden image: (Note: Download the file from the above link and do not copy and paste the text below, otherwise PS will complain.)

# STA Replacement Script for Citrix Provisioned Servers
# Created 7-25-11 by Ben Piper (email: ben@benpiper.com, web: http://benpiper.com )
# Get UID string
# Replace STA ID
# Restart CTXHTTP service

$stastr = get-content C:\Personality.ini | Select-String -Pattern “UID=STA”
$CtxStaConfig = Get-Content ‘C:\Program Files (x86)\Citrix\system32\CtxSta.config’ | ForEach-Object {$_ -replace ‘^UID=.+$’, $stastr}
$CtxStaConfig | Set-Content ‘C:\Program Files (x86)\Citrix\system32\CtxSta.config’
Stop-Service CtxHTTP
Start-Service CtxHTTP

3. Create a scheduled task to execute the PowerShell script at startup. Make sure the account that will be executing the script has appropriate permissions.

If you are not already running PowerShell scripts, you’ll need to set the ExecutionPolicy on your gold image to Unrestricted by issuing the cmdlet ” Set-ExecutionPolicy Unrestricted” at a PS prompt.

I recommend testing the script first on your Master Target server before deploying it farm-wide. The script will still work as long as you have a personality string defined for your Master Target server, even if the vDisk is in Private mode.