Get a Free Hardcopy of “Learn Cisco Network Administration”

For the rest of this month, I’m giving away 10 free hardcopies of my book Learn Cisco Network Administration in a Month of Lunches. Even if you already own the book, you can pick up a free extra copy to give to a friend, coworker, or just to leave around the office.

There are two ways you can get your copy:

If you’ve already read the book:

  1. Click here to leave a review on Amazon.
  2. Once your review goes live, send me an email and let me know. Remember to include where you want me to ship the book.

If you haven’t read it:

  1. Click here to sign up for my newsletter. If you’re already signed up, proceed to step 2.
  2. Send me an email and let me know you’ve already signed up. Remember to include your name and mailing address so I know where to send the book.

Looking for something more advanced?

If you’re already a Cisco CCNA-level professional and are ready to go to the next level, check out the CCNP Routing & Switching Learning Path.

It’s Time to Stop Using the Term Network Function Virtualization (NFV)

I think it’s time to stop using the term “network function virtualization”. Why? Because it doesn’t exist, at least not in the way the term suggests. It’s a category error, and when people try to make sense of it, confusion and frustration ensue.

Think of it like this: what’s the difference between a “virtual network function” and a “non-virtual network function”? For example, how is “virtual IP forwarding” different from “non-virtual IP forwarding”? Answer: it’s not.

So what then exactly is network function virtualization?

The Right Idea, The Wrong Term

The European Telecommunications Standards Institute, which arguably coined the term NFV, said the following in a 2012 whitepaper (emphasis mine):

Network Functions Virtualisation aims to address these problems by leveraging standard IT virtualisation technology to consolidate many network equipment types onto industry standard high volume servers

Focus on the key phrase: consolidating many network equipment types onto commodity servers. How does one do that? Let’s add some specifics to make it more concrete. How does one consolidate a firewall, router, switch, and load balancer onto a server? By implementing those network functions in software and putting that software on the server.

But here’s the problem with calling that “network function virtualization”: virtualization has nothing to do with implementing network functions in software. In the early days of the Internet, routers (or gateways, as they were called back then) were just software running on general-purpose computers with no virtualization (with the possible exception of virtual memory).

Network functions don’t need virtualizing, and in fact, can’t be virtualized. But the term NFV suggests otherwise.

And that’s where the confusion started.

NFV is like dividing by zero: undefined

Conceptually, NFV is just implementing network functions in software. That’s easy enough to understand. And yet it’s hard to find an actual definition of it anywhere. Instead, you’ll see a lot of hand-wavy things like this:

NFV is a virtual networking concept…
NFV is a network architecture concept that uses the technologies of IT virtualization…

Hence the letters “N” and “V”. And then you have those who gave up on a definition and just went straight for the marketing lingo:

NFV is the next step…
…is the future…
…is the progression/evolution…

Others get closer by hinting at what NFV does, but stop short of actually saying what it is:

NFV consolidates multiple network functions onto industry standard equipment

This seems pretty close, but where does the virtualization part come in? Let’s try this blurb from Angela Karl at TechGenix:

[NFV lets] service providers and operators… abstract network services, including things such as load balancing, into a software that can run on basic server.

Bingo. NFV is not virtualization at all. It’s an abstraction of network functions!

NFV is Abstraction, not Virtualization

Before you accuse me of splitting hairs, let me explain the distinction between virtualization and abstraction. Put simply, virtualization is an imitation, while abstraction is a disguise.

Virtualization is an imitation

When you virtualize something, you’re creating an imitation of the thing you’re virtualizing.

For example, when you create a virtual disk in your favorite hypervisor, you’re hiding the characteristics of the underlying storage (disk geometry, partition info, formatting, interface, etc.). But in the same motion, you give the virtual disk the same types of characteristics: disk geometry, partition info, formatting, interface, and so on. To put it in programming lingo, the properties are the same, but the values are different.

Virtualization preserves the underlying properties and doesn’t add any property that’s not already there. Have you ever pinged a virtual disk? Probably not, because virtual disks, like real disks, don’t have network stacks.

Virtualization also preserves the behavior of the thing being virtualized. That’s why you can “shut down” and “power off” virtual machines and “format” and “repartition” virtual disks.
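To put the “same properties, different values” idea into code, here’s a toy sketch in Python. None of this reflects any real hypervisor’s API; the class names and values are invented for illustration:

```python
# A toy sketch (not any real hypervisor's API) of the idea that a virtual
# disk exposes the same properties and behaviors as a physical disk, just
# with different values behind them.

class Disk:
    """The property set and behaviors shared by real and virtual disks."""
    def __init__(self, geometry, interface, size_gb):
        self.geometry = geometry        # same property names...
        self.interface = interface
        self.size_gb = size_gb
        self.partitions = []

    def format(self, fs):               # ...and the same behaviors
        self.filesystem = fs

class PhysicalDisk(Disk):
    pass                                # values come from real hardware

class VirtualDisk(Disk):
    def __init__(self, backing_file, **kwargs):
        super().__init__(**kwargs)
        self.backing_file = backing_file  # hidden from the guest OS

real = PhysicalDisk(geometry="CHS 16383/16/63", interface="SATA", size_gb=500)
virt = VirtualDisk("guest.vmdk", geometry="CHS 1024/255/63",
                   interface="SCSI", size_gb=40)

# Same properties, different values -- and nothing extra like a network stack.
assert set(vars(real)) == set(vars(virt)) - {"backing_file"}
```

The point of the final assertion: the imitation hides the backing values but preserves the interface, and it never adds a property (like a network stack) that the real thing doesn’t have.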

Now try fitting NFV into this definition of virtualization. How do you “virtually route” or “virtually block” a packet? It’s a category error.

Abstraction is a disguise

When you create an abstraction, you’re creating a disguise. Unlike virtualization, with abstraction you’re changing some of the properties of the thing you’re abstracting. You’re taking something and dressing it up to look and act completely different.

Swap space is a good example of an abstraction. It’s data on storage that looks and acts like random access memory (only much slower). Before the days of SSDs, swap lived on spinning disks, which perform best when read and written sequentially. That’s completely different from memory, which can be read and written randomly. Swap space is a file (Windows) or partition (Linux) disguised as RAM.

The Case for Abstracting Network Functions

Let’s bring this around to networking. What does it mean to abstract network functions like IP routing and traffic filtering? More importantly, why would you want to? Why not just use virtual routers, switches, and firewalls?

Simply put, virtualized network devices don’t scale. The reasons are too numerous to list here, but suffice it to say that TCP/IP and Ethernet networks carry a lot of built-in overhead. That’s why cloud providers take network function abstraction to an extreme. It’s utterly necessary. Let’s take Amazon AWS as an example.

In AWS, an instance has a virtual network interface. But what’s that virtual network interface connected to? A virtual switch? Nope. A virtual router? Try again. A virtual firewall? Negative. Virtual routers, switches, and firewalls don’t exist on the AWS platform. So the question remains: what’s that virtual NIC connected to?

The answer: nothing. The word “connected” here is a concept borrowed from the physical world. You “connect” NICs to switches. In your favorite hypervisor, you “connect” a vNIC to a vSwitch.

But there are no virtual switches or routers in this cloud. They’ve been abstracted into network functions. AWS presents this as if you’re connecting a virtual interface to a “subnet” rather than a router. That’s because AWS has abstracted IP routing away from you, leaving you with nothing to “connect” to. After all, we’re dealing with data. Not devices. Not even virtual devices. So what happens? The virtual NIC passes its traffic to some software that performs network functions. This software does a number of things:

  • Switching – It looks at the Ethernet frame and checks the destination MAC address. If the frame contains an ARP request seeking the default gateway, it replies.
  • Traffic Filtering – If it’s a unicast for the default gateway, it looks at the IP header and checks the destination against the security group rules, NACLs, and routing rules.
  • Routing – If it needs to forward the packet, it forwards it (although forwarding may simply consist of passing it off to another function).

This is a massive oversimplification, of course, but you get the idea. There’s no reason to “virtualize” anything here because all you’re doing is manipulating bits!
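The steps above can be sketched as a single piece of software. This is purely illustrative pseudologic; every name and rule below is invented, and it is in no way AWS’s actual implementation:

```python
# Purely illustrative: one software pipeline standing in for the "switch",
# "firewall", and "router" a vNIC would otherwise connect to. All names and
# rules here are invented; this is not AWS's actual implementation.

GATEWAY_MAC = "0a:00:00:00:00:01"        # hypothetical default-gateway MAC
ALLOWED_PORTS = {22, 443}                # stand-in for security group rules
ROUTES = {"10.0.0.0/16": "local",        # toy route table
          "0.0.0.0/0": "igw-abc123"}

def handle_frame(frame):
    """Process one frame from a virtual NIC: switch, filter, then route."""
    # "Switching": answer ARP requests for the default gateway ourselves
    if frame["type"] == "arp":
        return {"type": "arp-reply", "mac": GATEWAY_MAC}

    # "Traffic filtering": unicast for the gateway gets the rule check
    if frame["dst_mac"] == GATEWAY_MAC:
        if frame["dst_port"] not in ALLOWED_PORTS:
            return {"type": "drop"}          # blocked by "security group"
        # "Routing": reduced to a toy default-route lookup
        return {"type": "forward", "via": ROUTES["0.0.0.0/0"]}

    return {"type": "drop"}                  # nothing else to deliver to

# Example: an HTTPS packet headed for the gateway gets forwarded
print(handle_frame({"type": "ipv4", "dst_mac": GATEWAY_MAC, "dst_port": 443}))
# -> {'type': 'forward', 'via': 'igw-abc123'}
```

Notice there’s no virtual device anywhere in the sketch: just functions manipulating bits, which is the whole point.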

Overvirtualizing the Network

It’s possible to over-virtualize. To give an analogy, suppose you wanted to write a calculator application (let’s call it a virtual calculator). You’d draw a little box with numbers and operators, and let the user click the buttons to perform a calculation. Now imagine that you also decided to write a “virtual hand” application that virtually pressed buttons on the virtual calculator. That would be ridiculous, but that’s essentially what happens when you connect two virtual network devices together.

There’s an especially great temptation to do this in the cloud. Folks may spin up virtual firewalls, cluster them together, and connect them to virtual load balancers, IDSes, and whatnot. That’s not bad or technically wrong, but in many cases it’s just unnecessary. All of those network functions can be performed in software, without the additional complexity of virtual NICs connecting to this and that.

The Difference Between a Virtual Network Device and a Network Function

When it comes to the cloud, it’s not always clear what you’re looking at. Here are some questions I ask to figure out whether a thing in the cloud is a virtual device or just an abstracted network function:

Is there an obvious real world analog?

There’s a continuum here. An instance has a clear analog: a virtual machine. An Internet gateway sounds an awful lot like the router your ISP puts at your site, but “connecting” to it is a bit hand-wavy. You don’t get a next-hop IP or interface. Instead, your next hop is igw- followed by some gibberish. That smacks of an abstraction to me.

Can you view the MAC address table or create bogus ARP entries?

If you can, it’s a virtual device (maybe just a Linux VM). If not, it’s likely some voodoo done in software.

Can you blackhole routes?

In AWS you can create blackhole routes, although people usually do it by accident. You can create a route with an internet gateway as a next hop, then delete the gateway. But can you create a route pointing to null0? If not, you have an abstraction, not a virtual device.
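For contrast, here’s what an intentional blackhole route looks like on a real or virtual Cisco router, where you do get a null interface to point at (the prefix is just a documentation-range example):

```
! Cisco IOS: deliberately discard all traffic destined for 203.0.113.0/24
ip route 203.0.113.0 255.255.255.0 Null0
```

If the platform gives you no equivalent of Null0 and blackholes only happen by accident, you’re probably looking at an abstraction.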

Does the TTL get decremented at each hop?

A TTL in an overlay can get decremented based on the hops in the underlay. But what I’m talking about here is not decrementing the TTL when you normally would. AWS doesn’t decrement the TTL at each hop. If you were to get into a routing loop, you’d have a nasty problem. Hence, AWS doesn’t allow transitive routing through its VPCs. So if your TTLs don’t go down at each hop, as with AWS, you’re probably dealing with an abstraction.

 

Why People Haven’t Adopted IPv6 (And Why You Should Learn It Anyway)

If you haven’t learned IPv6 yet, well, you’re not the only one. In December 2016, IPv6 (as we know it today) turned 18 years old. Children who were in the womb when RFC 2460 was being drafted are now old enough to vote, get married, and purchase firearms in some states.

In honor of IPv6’s 18th birthday, allow me to share my theories on why people have been so slow to adopt it. And why you still should consider learning it.

The “Lame name” theory

IPv6 terminology makes it sound like a new version of IPv4, but it’s not. It’s a totally different protocol with a similar name. If you’re familiar with the confusion between Java and JavaScript, you know what I’m talking about. People who set out to learn IPv6 are disappointed when they find out it’s almost nothing like IPv4.

The “Let’s split DHCP in half and spread its most popular functions across two protocols” theory

DHCP for IPv4 can provide clients with IP addresses, DNS servers, default gateways, TFTP servers, and pretty much anything else. DHCPv6, however, has no option for providing a default gateway. If you want to push a default gateway to clients, you have to rely on router advertisements, the same mechanism SLAAC uses.
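As a hedged sketch, here’s what that split looks like in Cisco IOS (the addresses, pool name, and interface are made up): router advertisements announce the default gateway, while stateless DHCPv6 supplies only the “other” information, such as DNS servers.

```
! Hypothetical IOS config: RAs provide the default gateway; the O flag
! tells clients to fetch DNS and other options via stateless DHCPv6.
ipv6 unicast-routing
!
ipv6 dhcp pool LAN-POOL
 dns-server 2001:db8::53
 domain-name example.com
!
interface GigabitEthernet0/0
 ipv6 address 2001:db8:1::1/64
 ipv6 nd other-config-flag
 ipv6 dhcp server LAN-POOL
```

Two protocols, two halves of what DHCP for IPv4 did by itself.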

The “all things to all people, places, animals, plants” theory

IPv4 has only a few address types that anyone actually uses. Colloquially, they’re public, private (RFC 1918 addresses like 192.168.1.1), and multicast (which includes broadcast). IPv6 has approximately one zillion different address types, including unique-local, link-local, unspecified, and global unicast. Although there are technical justifications for some of these, the plethora of address types makes no sense to anyone who doesn’t deeply understand why “layer 2” is even in the IT lexicon.

The “IPv4 apocalypse” theory

We’ve all heard the constant chicken-little talk about how we have to move to IPv6 yesterday or the Internet will die. Driving this is the myth that all IPv4 addresses are gone. They’re not, and the U.S. government is sitting on tens of thousands of addresses it’s never going to use. What really happened was that in 2011, the Internet Assigned Numbers Authority (IANA) assigned the last of its available IPv4 address space to the regional Internet registries (RIRs), which are responsible for doling out addresses. But the IPv4 addresses didn’t just go away. They still exist, and many of them are unused and can be reassigned.

The “NAT is a tool of the devil” theory

If you ever want to have fun, go on any IT forum and ask, “Why do we need IPv6 when we have NAT?” Actually, don’t. That would be trolling. But if you were to ask that question, you’d probably get a few responses hating on IPv4 NAT as a tool of the devil, which IPv6 will save us from… except it does NAT, too.

The “Why do I need both again?” theory

Implementing IPv6 almost always requires a dual-stack implementation, running IPv4 and IPv6 side by side, which people figured out about 30 years ago is a bad idea because running multiple protocols confuses everybody. IT admins translate this as, “More work for me.”

The “Because we can” theory

There are enough IPv6 addresses for every cell in your body to have its own internet. Seriously? This, like NAT, is another non-reason to adopt it. Yes, it’s cool that I can give my Uncle Milton’s ant farm its own Internet. But as far as business justification goes, nope.

Why you might want to learn IPv6 (hint: money)

Although it’s been poorly marketed, it’s still worth learning. In fact, I believe in IPv6 so strongly that I’ve created several Pluralsight courses on configuring and troubleshooting it.

Here are three big reasons to consider adding it to your set of skills:

  • It’s like a sports team. The big boys are rooting for it. I’m talking about Cisco, Juniper, ISPs, Google, et al. They want to see it win, and they’ll pay to make it happen. You can be on the receiving end of some of those payments.
  • The confusion and complexity around IPv6 has made experts that much more valuable to companies who have already invested in new infrastructure.
  • If you know IPv4, IPv6 isn’t that hard to learn once you realize that it’s a distinct protocol and not a new version of IPv4.

For further IPv6 learning, check out my Practical Networking course.

You failed your CCNP exam. Now what?

You took one of the Cisco CCNP Routing and Switching certification exams. You went to the exam center, sat down, and started the exam. About 2 hours later, you saw the dreaded news appear on the screen:

You didn’t pass.

I’ve failed certification exams in the past, so I can relate to the facepalm-worthy feeling you get when you realize you dropped a couple of Benjamins on an exam that you just failed. I know the feeling of wanting to give up, the thoughts of thinking that this whole certification thing is stupid, and the desire to assign blame to whomever or whatever led to your failure.

Failing certification exams is a reality for any IT professional. And from what I’ve seen, sadly, not many people handle failure very well. I want to talk through this.

This isn’t meant to be a pep talk or a “you’ll do better next time” motivational speech. Neither is it meant to assign blame to you or anyone else. Rather, it’s a cold, hard look at why you failed and how you can pass next time, or the time after that.

Why you failed

I’ve taken a lot of Cisco certification exams and read a lot of Cisco books over the years, and I’ve noticed a pattern. Cisco likes to play off of common misconceptions and little-known technical facts. Here’s a made-up but representative example:

Two switches are connected via an 802.1Q trunk. You delete the switched virtual interface for VLAN 1 but both switches still exchange CDP messages. What will prevent CDP messages from traversing VLAN 1 without affecting Cisco IP phones?

Select the best answer:

A. Prune VLAN 1 from the trunk

B. Disable VLAN 1

C. Disable CDP globally

D. Disable CDP on the trunk

E. None of these

If you’ve watched my Pluralsight course series on the CCNP SWITCH exam, you’ll recall that you can’t disable VLAN 1 or prune it from a trunk. Well, you can try to prune it, but CDP messages will still pass. But do you disable CDP globally or just on the trunk interface? This is where obscure knowledge comes in. Cisco IP phones use CDP to get voice VLAN information, so disabling CDP globally is out. That leaves only two answers: disable CDP on the trunk interface or none of the above. Disabling CDP on the trunk interface will certainly stop the CDP messages from moving between the switches, and it won’t affect Cisco IP phones since CDP messages never travel beyond the local link.
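In IOS terms, the winning answer is a two-liner (the interface number here is assumed for illustration):

```
! Turn off CDP on the trunk port only; CDP stays enabled globally,
! so IP phones on access ports still learn their voice VLAN via CDP.
interface GigabitEthernet0/24
 no cdp enable
```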

Now here’s the thing: I made that question and answer up on the fly. You have to be able to do that if you want to do well on the exam.

The exam blueprint is like The Oracle, and sometimes just as wrong

In The Matrix movies, you may remember the Oracle, a computer program that supposedly knows all. After seeing the Oracle for the first time, Neo asks Morpheus how accurate the Oracle’s “prophecies” are. Morpheus responds with something to the effect of, “Try not to think of it in terms of right and wrong. The Oracle is a guide to help you find the path.” Not surprisingly, it turned out the Oracle was kinda wrong on some stuff.

Well, the blueprint is a lot like that. It has stuff that never shows up on any exam. This is mainly because if the exam covered the entire blueprint, it would be 8 hours long. It also leaves off some topics that do appear on the exam. The lesson here is don’t depend on the exam blueprint. Make sure you know the topics for prerequisite and related exams. If you’re taking CCNP SWITCH, make sure you know the topics for ROUTE. If you’re taking TSHOOT, make sure you know ROUTE and SWITCH. Of course, make sure you know all the CCNA R&S topics upside down and backwards.

Each exam blueprint is a guide. It’s a guide to the other exam blueprints.

How to pass next time, or the time after

Once you’ve already taken a CCNP exam, the next time you go in to take the same exam, you’re technically “brain dumping” parts of it. I’m not talking about cheating. I mean you’ve seen the exam already, and you have a feel for what the questions are like. If you’ve got lots of time and money, you can take the same exam over and over again, getting slightly better each time until you pass. I don’t recommend this strategy, not just because it’s expensive, but because it puts you in the super awkward situation of telling others how many times you took the exam. Trying until you pass is respectable, but you should have some serious expertise to show for it. If I’m interviewing you and it took you 5 tries to pass a CCNP exam, I’m going to grill you hard on the technical questions.

If you want to have a great chance of passing the next time, then study for the certification one step higher than the one you want to attain. If you’re studying for the CCNA, act like you’re studying for the CCNP. If you want the CCNP, act like you’re studying for the CCIE. Obviously the topics are different. You don’t need to study multicast in-depth for your CCNP. But for the topics that overlap, it’s better to overshoot than aim for the bare minimum.

New book! Learn Cisco Network Administration in a Month of Lunches

The pre-release of my new book, Learn Cisco Network Administration in a Month of Lunches, is available from Manning Publications’ early access program.

The book is a tutorial designed for beginners who want to learn how to administer Cisco switches and routers. Set aside a portion of your lunch hour every day for a month, and you’ll start learning practical Cisco network administration skills faster than you ever thought possible.

Citrix Web Interface: Error occurred while making the requested connection

I recently ran into a bizarre issue with users not being able to launch applications from a very old Citrix Presentation Server 4.0 farm when trying to launch from Citrix Web Interface 5.4. They were getting the eminently unhelpful, “An error occurred while making the requested connection.”

The Diagnosis

In the web interface application logs, I noticed this:

An error of type IMA with an error ID of 0x80000003 was reported from the Citrix XML Service at address (servername)

And this:

The farm MyFarm has been configured to use launch references, but a launch reference was not received from the Citrix XML Service. Check that the farm supports launch references or disable launch reference requests.

The Solution

To resolve this, I modified C:\inetpub\wwwroot\Citrix\XenApp\conf\WebInterface.conf on the Web Interface servers and changed the RequireLaunchReference directive (previously set to On) as follows:

RequireLaunchReference=Off

And it worked. Supposedly, that directive must be set to Off when using Web Interface 5.4 with PS 4.0. But I’d been running for years with it set to On, and it worked fine until recently. Another Citrix mystery.

Want more Citrix tips and tricks? Watch my Citrix NetScaler course!

Net Neutrality is a Scam

One of the biggest scams of the Internet is in full swing right now. You may have heard of it. It’s called “net neutrality.”

Fundamentally, net neutrality is about preventing Internet service providers (ISPs) from throttling or blocking traffic or providing paid prioritization of certain content. In addition, specific rules proposed by FCC Chairman Tom Wheeler would allow the FCC to arbitrate peering disputes between carriers. Traditionally, carriers have interconnected their networks for a nominal cost or none at all, the idea being that the mutual benefit of using each other’s networks for transit is payment enough. The proposed FCC rules, however, would turn this once-amicable arrangement into a litigious battleground that could destabilize the Internet’s backbone.

I recall an article from a 1997 issue of Wired magazine predicting that the Internet would collapse under growth its infrastructure couldn’t support. That never happened, partly due to technical innovation that kept up with growth, but also because ISPs and backbone carriers were able to throttle traffic during peak times to ensure everyone could have reasonably fast and reliable Internet access.

Now, almost 20 years later, we’re looking at potential regulation that will micromanage how ISPs manage and build out their networks. As a network engineer, I understand the need to throttle or simply block certain types of traffic. But unfortunately, the technical facts have gotten lost amidst the raw politicization of the net neutrality debate. I recently saw a graphic put out by the pro-net neutrality group “Battle for the Net” that shows a picture of the United States Senate and a caption that asks, “Does your state have the Internet’s worst enemy?” It then proceeds to list all the Senators that are supposedly trying to “kill Net Neutrality.” And this is the problem with the net neutrality movement. It’s purely political and devoid of any thoughtful technical or practical discussion. Organizations like Battle for the Net don’t bother to make a case for net neutrality. They assume that it is an absolute good and that being for the Internet means being for net neutrality.

The discussion has devolved from a debate into a marketing battle plagued by word games and politics. Net neutrality advocates have adopted the language that this is “a battle for the Internet” and an effort to “keep the internet open.” Apparently, by breaking decades of precedent and giving the FCC more power to control what Internet service providers do, the Internet will somehow become better. The narrative they put forth is that the big bad cable companies with their zillions of dollars are trying to make end users’ Internet experience slow and expensive, and are fighting valiant efforts to “keep the internet free.” (Never mind the fact that the cable companies gave us broadband Internet and brought us out of the dial-up era to begin with.) This David-versus-Goliath theme is great for stirring emotions, but it falls flat in the face of a little scrutiny. Google, whose income is more than double that of Comcast, is strongly in favor of “net neutrality” regulations. So are Netflix and Facebook.

Regardless of where you stand on net neutrality, one thing is certain: this is not about big-money corporations versus the gentle folks of the Internet. It is about giant corporations duking it out for power, control, and government favor. As usual, the politics of net neutrality has turned the debate into a sporting event where everyone roots for his own team no matter what. But it’s actually worse than that. If you’re against net neutrality, some will perceive you as being anti-Internet or against Internet freedom. I find this both amusing and disturbing. Amusing, because the notion that giving the FCC unprecedented regulatory power over the Internet will somehow increase freedom is absurd. And disturbing, because so many have blindly taken sides in this debate without understanding its implications or what it’s even about.

One such implication is privacy. How will the FCC ensure that ISPs are complying with the new regulations and not throttling or blocking certain types of traffic? The only way to know is by looking at the traffic, which can only be done with detailed logs of what an ISP’s users are doing. This goes beyond what websites you visited or how many gigabytes you downloaded. This gets down to individual connections. What IP address and port did you connect to? What protocol were you using?

Certainly, these things can be logged now, and in fact probably are. But the difference is that, as of now, the FCC has no authority to demand such logs. With net neutrality regulations in place, it will, and it will also have the power to exact fines if ISPs fail to retain logs for a certain period of time. So you will be able to BitTorrent without restriction, but Uncle Sam is probably going to know about it. Of course, this is already happening with the NSA pretty much spying on everything. But again, the difference is that instead of spying secretly, the collection of your Internet activity will be open and shameless. That may not bother you. Honestly, it doesn’t really bother me. The point is that net neutrality regulations come with some long and tangled strings attached. And it’s wise to unravel them and see where they lead before throwing in your support for the wolf in sheep’s clothing.

How to Make NetApp Use the Correct Interface for iSCSI

If you’re familiar with networking, you know that when a device is directly connected to two separate IP networks, traffic destined for one of those networks should egress on the interface that is directly connected to that network. For example, if your storage appliance is directly connected to the 172.16.1.0/24 network and you want to send a packet to a device with the IP address 172.16.1.55, traffic should egress on the interface connected to that network. Unfortunately, on some NetApp filers, this does not always happen.

I ran into a peculiar issue when trying to force NetApp’s Snapmirror to replicate across a specific interface, only to be met with an ugly “Snapmirror error: cannot connect to source filer (Error: 13102)”. I confirmed with NetApp support that the Snapmirror configuration was correct for what I was trying to accomplish.

To troubleshoot, I started a packet trace on the destination filer using the command:

pktt start all -d /etc

I then kicked off the snapmirror initialization, waited for it to fail, then stopped the packet trace with

pktt stop all

Since I directed the trace files to be placed in /etc on the filer, I just browsed to the hidden etc$ CIFS share on the filer and opened the traces in Wireshark. What I found was that the traffic that should have been egressing on the iSCSI VIF was actually going out on the LAN VIF. Not only that, the filer was using its iSCSI address on the LAN VIF! I’m always hesitant to label every quirk a “bug,” but this is definitely not correct behavior.

The remedy was as simple as adding a route statement similar to this:

route add inet 172.16.1.0/24 172.16.2.1 1

where 172.16.1.0/24 is the iSCSI network I want to traverse to reach the Snapmirror partner, and 172.16.2.1 is the gateway on my locally connected iSCSI network. The 1 specifies the cost metric for the route, which will always be 1 unless you need to add additional gateways.

To make the change permanent, simply add the route statement to the /etc/rc file on the filer.

Special thanks to NetApp’s Scott Owens for pointing me in the right direction on this.

Using IRQbalance to Improve Network Throughput in XenServer

If you are running XenServer 5.6 FP1 or later, there is a little trick you can use to improve network throughput on the host.

By default, XenServer uses the netback process to handle network traffic, and each host is limited to four instances of netback, with one instance running on each of dom0’s vCPUs. When a VM starts, each of its VIFs (Virtual InterFaces) is assigned to a netback instance in round-robin fashion. While this results in a fairly even distribution of VIFs to netback processes, it is extremely inefficient during times of high network load because the host’s CPUs are not being fully utilized.

For example, suppose you have four VMs on a host, each with one VIF. VM1 is assigned to netback instance 0, which is tied to vCPU0; VM2 is assigned to netback instance 1, which is tied to vCPU1; and so on. Now suppose VM1 experiences a very high network load. Netback instance 0 is tasked with handling all of VM1’s traffic, and vCPU0 is the only vCPU doing work for that instance. That means the other three vCPUs sit idle while vCPU0 does all the work.

You can see this phenomenon for yourself by doing a cat /proc/interrupts from dom0’s console. You’ll see something similar to this:


(The screenshot doesn’t show it, but the first column of highlighted numbers is CPU0, the second is CPU1, and so on. The numbers represent the quantity of interrupt requests.)

If you’ve ever troubleshot obscure networking configurations in the physical world, you’ve probably run into a router or firewall whose CPU was being asked to do so much that it was causing a network slowdown. Fortunately in this case, we don’t have to make any major configuration changes or buy new hardware to fix the problem.

All we need to do to increase efficiency in this scenario is to evenly distribute the VIFs’ workloads across all available CPUs. We could manually do this at the bash prompt, or we could just download and install irqbalance.
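The manual route, for the curious, means pinning each VIF’s interrupt to a different vCPU by writing CPU bitmasks into /proc. The IRQ numbers below are made up for illustration; you’d read the real ones from /proc/interrupts:

```
# CPU bitmasks: 1=CPU0, 2=CPU1, 4=CPU2, 8=CPU3 (one bit per CPU)
echo 1 > /proc/irq/24/smp_affinity   # hypothetical VIF IRQ for VM1 -> vCPU0
echo 2 > /proc/irq/25/smp_affinity   # VM2 -> vCPU1
echo 4 > /proc/irq/26/smp_affinity   # VM3 -> vCPU2
echo 8 > /proc/irq/27/smp_affinity   # VM4 -> vCPU3
```

irqbalance does essentially this for you, continuously and without the guesswork.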

irqbalance is a Linux daemon that automatically distributes interrupts across all available CPUs and cores. To install it, issue the following command at the dom0 bash prompt:

yum install irqbalance --enablerepo base

You can either restart the host or manually start the service/daemon by issuing:

service irqbalance start

Now restart your VMs and do another cat /proc/interrupts. This time you should see something like this:

That’s much better! Try this out on your test XenServer host(s) first and see if you can tell a difference. Citrix has a whitepaper titled Achieving a fair distribution of the processing of guest network traffic over available physical CPUs (that’s a mouthful) that goes into more technical detail about netback and irqbalance.