It occurs to me that I’ve been writing the last few posts about network management tasks based on an ITSM model, and I didn’t even introduce what is arguably the more useful model for breaking down and understanding network management tasks: the FCAPS model.
FCAPS has its roots in the ISO, similar to another model we all know and love: the OSI model. Everyone remember that one? Please Do Not Take Sales People’s Advice? You may have learned another mnemonic for it, but this is probably the most basic conceptual model that every networking person uses to understand the world we live in.

For those of you who are looking for some extra credit reading, or need a cure for insomnia, you can find the actual FCAPS standards in the ITU-T M.3400 recommendations. For the rest, I’m hoping to give a brief overview to help you understand the different aspects of the disciplines of network management.

F is for Fault

This involves the detection, isolation, and correction of a fault condition. Or in plain English, this lets you know when things are broken.

Fault Management could involve things like syslog messages and SNMP traps being escalated to alarms; root-cause analysis and alarm suppression, or some AI which tries to separate the signal from the noise during event storms; and alarm notification policies (sending out an e-mail once you get an alarm).

Traditionally this was implemented in a lot of NMSs as green-is-good management. Basically, if everything is green, things are OK. If they are yellow or red, you’ve probably got a long night ahead of you.
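To make the alarm suppression idea a bit more concrete, here is a minimal sketch of one simple approach: drop repeated events from the same source inside a time window. The `Event` class, the device names, and the 60 second window are all assumptions of mine for illustration. A real NMS will have far richer correlation rules than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary start time
    source: str       # hypothetical device name
    kind: str         # hypothetical event type, e.g. "linkDown"


def suppress_duplicates(events, window=60.0):
    """Return only the events that should raise an alarm.

    An event is suppressed when the same (source, kind) pair already
    raised an alarm less than `window` seconds earlier.
    """
    last_alarm = {}
    alarms = []
    for event in sorted(events, key=lambda e: e.timestamp):
        key = (event.source, event.kind)
        previous = last_alarm.get(key)
        if previous is None or event.timestamp - previous >= window:
            alarms.append(event)
            last_alarm[key] = event.timestamp
    return alarms


# A tiny, made-up event storm.
storm = [
    Event(0.0, "core-sw1", "linkDown"),
    Event(5.0, "core-sw1", "linkDown"),   # repeat inside the window
    Event(10.0, "edge-rtr2", "linkDown"),
    Event(70.0, "core-sw1", "linkDown"),  # outside the window again
]
alarms = suppress_duplicates(storm)
for alarm in alarms:
    print(alarm.timestamp, alarm.source, alarm.kind)
```

The design choice here is deliberately crude: it only removes noise from a single chatty source. It does not attempt root-cause analysis, which needs knowledge of the topology.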

In recent years, Fault Management has started to include application performance management as well. In modern networks, it’s not enough to know that an application is “up”. Now we must also make sure that the level of service, or SLA, that is being delivered to the end-user is adequate to meet their needs.

Note: Whether an activity falls into one category of FCAPS or another might depend on your perspective. If you are measuring bandwidth on a particular port, you may be in the “P”, but if you are measuring the bandwidth and raising an alarm if you cross a certain threshold, you’re now in the “F”.
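The bandwidth example in the note can be sketched in a few lines. Measuring utilization is the “P” part, and comparing it to a threshold to raise an alarm is the “F” part. The function names, the 80% threshold, and the sample values are assumptions for illustration only.

```python
def utilization_percent(bits_per_second, link_speed_bps):
    """Performance: turn a measured rate into percent utilization."""
    return 100.0 * bits_per_second / link_speed_bps


def check_threshold(utilization, threshold=80.0):
    """Fault: return an alarm message if the threshold is crossed, else None."""
    if utilization > threshold:
        return f"ALARM: utilization {utilization:.1f}% exceeds {threshold:.1f}%"
    return None


# Made-up samples from a 1 Gb/s port.
link_speed = 1_000_000_000
samples_bps = [200_000_000, 650_000_000, 910_000_000]

for bps in samples_bps:
    util = utilization_percent(bps, link_speed)  # the "P"
    alarm = check_threshold(util)                # the "F"
    print(f"{util:.1f}%", alarm or "ok")
```

Same data, same port. The only thing that changed is what you do with the number.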

This may seem confusing at first, but remember that FCAPS is just a conceptual model. This is similar to the 7-layer OSI model. Ask any good network person what layer MPLS falls at and they will either answer “It depends” or potentially “2.5”.

C is for Configuration

This involves the configuration of the software and hardware in the network. This includes the versions of software, the actual configurations, change management, etc…

This is probably the easiest to understand. If you’re upgrading code on a switch or router, if you’re logging into a router to make a configuration change, or if you’re just plugging a network cable into a PC, you’re in the “C”s.

A is for Accounting

This involves the identification of cost to the service provider and payment due from the customer, i.e. billing.

Personally, I find this definition a little restrictive and prefer to apply the definition that I heard in a presentation. I wish I could remember the name of the gentleman to give him credit. He started out in a thick southern drawl:

The thing to remember about a’counting, is that the rest of the world just calls it counting.

I know. Barely funny, right?

But it does allow us to use this to include things like:

NetFlow for counting the different protocols running across a certain WAN link.
SNMP polling of T1/PRI interfaces for ensuring that your Erlang calculations are accurate and you don’t need to raise or lower the number of trunks on your voice gateways.
RADIUS to track how long a user was logged into a specific port on the network or how much bandwidth they actually used.

You get the picture. Basically, accounting is just counting things which might be interesting to you.
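For the Erlang example above, here is a minimal sketch of the standard Erlang B calculation, using the usual recursive form of the formula. The helper names, the 1% blocking target, and the busy-hour numbers are assumptions of mine. In practice the call seconds would come from whatever your polling actually collects.

```python
def erlang_b(traffic_erlangs, trunks):
    """Blocking probability for the offered traffic on a number of trunks.

    Uses the recursion B(0) = 1, B(n) = A*B(n-1) / (n + A*B(n-1)).
    """
    blocking = 1.0
    for n in range(1, trunks + 1):
        blocking = (traffic_erlangs * blocking) / (n + traffic_erlangs * blocking)
    return blocking


def trunks_needed(traffic_erlangs, target_blocking=0.01):
    """Smallest number of trunks that keeps blocking at or below the target."""
    trunks = 0
    while erlang_b(traffic_erlangs, trunks) > target_blocking:
        trunks += 1
    return trunks


def busy_hour_erlangs(total_call_seconds):
    """Offered traffic in erlangs for one hour of measured call time."""
    return total_call_seconds / 3600.0


# Made-up busy hour: 36,000 call seconds counted on the voice gateway.
offered = busy_hour_erlangs(36_000)
print("offered traffic (erlangs):", offered)
print("blocking on 23 trunks:", erlang_b(offered, 23))
print("trunks needed for 1% blocking:", trunks_needed(offered))
```

The 23 in the example is the number of bearer channels on a single T1 PRI. If the counted traffic says you need fewer or more than you have, that is your cue to revisit the trunk count.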

Although this is not the strict definition from ITU-T M.3400, this amended version is easier for me to apply because I don’t have very many customers (read: any) who actually do charge-backs for their services.

Obviously, in an XaaS world, this domain is probably going to get a lot of attention in the coming years.

P is for Performance

This involves evaluating and reporting on the effectiveness of the network, and individual network devices.

Way back when I did my CCNA, one of the things I remember reading about was how you should be checking your routers and switches often to see if their CPU or memory was running high. I’ve never actually met anyone who logged into a device to check on a daily basis, but the advice was actually really good.

With a good NMS, you can:

Use SNMP polling for CPU and memory to track their trending over time.
Use ICMP to track availability of the devices (assuming they respond!).
Use ICMP to track the latency to the device to test the quality of the link.

As I mentioned in the Fault section, performance often blurs with fault in that good performance management habits can alert you to faults in the network. In fact, good performance management can even allow you to proactively avoid faults by identifying a potential performance block in the network, and addressing the issue before it turns into a fault.
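As a rough sketch of what “proactive” can mean here, the snippet below fits a straight line to polled utilization samples and estimates how long until the trend crosses a threshold. The daily sample values and the 80% threshold are made up, and a straight line is a big simplifying assumption. Real traffic has daily and weekly cycles that a proper tool would account for.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_threshold(days, values, threshold):
    """Days after the last sample until the trend reaches the threshold.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = linear_fit(days, values)
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])


# Made-up daily average CPU readings (percent) from SNMP polling.
days = [0, 1, 2, 3, 4]
cpu = [50.0, 52.0, 54.0, 56.0, 58.0]
remaining = days_until_threshold(days, cpu, threshold=80.0)
print("days until 80% CPU:", remaining)
```

Even a crude estimate like this turns “the router feels busy” into a date you can plan an upgrade around.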

Probably the most important thing to know about performance management is that it helps you make better decisions.

Most good network engineers can instinctively know where the bottlenecks are in their networks and can usually correctly identify what needs to be upgraded to get the most benefit.

Most great network engineers can use the pretty graphs from a good performance management tool to get the money from their CFO for those upgrades.

In my home network, I actually track the response time of all my links, as well as additional services, such as the one below which allows me to keep my wife happy.

[Figure: Facebook Response Time Performance Tracking]

Note: probably the most recognizable performance management tool would be MRTG/PRTG. I can’t even imagine how many network upgrades were justified by the pretty graphs that came out of these tools.

S is for Security

Security is… well, security. These are the network management activities that involve securing the network and the data running over it.

In a lot of ways, I strongly believe that security should be addressed in every waking (and sleeping!) moment that you’re thinking about your networks. Security should become so second nature to us that it should be almost impossible to perform any of the other tasks without security entering the conversation.

What do I mean?

Fault – CIA – Confidentiality, Integrity, and Availability. It’s hard to be secure when the network is not available, and the Fault domain helps us keep it that way!

Configuration – Auditing – Good configuration management practices can involve automated IT Control objective verification tools, otherwise known as “scripts” which will allow us to have the NMS ensure all the configurations are what they should be and no unneeded services are on our routers and switches.
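The kind of “script” I mean can be very small. Here is a minimal sketch that checks a configuration backup against a list of lines that must be present and lines that must not be. The sample config and both policy lists are illustrative IOS-style examples of my own, not a recommended baseline, so substitute whatever your own control objectives call for.

```python
def audit_config(config_text, required, forbidden):
    """Compare a device configuration against simple policy lists.

    Lines are matched exactly after stripping whitespace, so
    "no ip http server" is not mistaken for "ip http server".
    """
    lines = {line.strip() for line in config_text.splitlines() if line.strip()}
    findings = []
    for line in required:
        if line not in lines:
            findings.append(f"missing required line: {line}")
    for line in forbidden:
        if line in lines:
            findings.append(f"forbidden line present: {line}")
    return findings


# Hypothetical policy, for illustration only.
REQUIRED = ["service password-encryption"]
FORBIDDEN = ["ip http server", "snmp-server community public RO"]

# Made-up configuration backup.
sample_config = """
hostname edge-rtr1
service password-encryption
ip http server
snmp-server community public RO
"""

findings = audit_config(sample_config, REQUIRED, FORBIDDEN)
for finding in findings:
    print(finding)
```

Run something like this against every nightly backup and you have the beginnings of automated auditing without touching a single device by hand.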

Performance – Most performance data is collected over SNMP, and if you’re using SNMP, PLEASE USE SNMPv3 if possible! It supports both encryption and message integrity. Also, lock down your management interfaces with ACLs on your devices.

FCAPS

It’s just a model

Please don’t take it too seriously. It’s not a binary model. Feel free to apply some fuzzy logic here and be confident that it’s 46% Fault Management and 54% Performance Management.

The important thing here is that it helps us understand the network management world we live in. It gives us a conceptual model to be able to understand the different activities involved in network management. As an added bonus, it also gives us a handy tool to evaluate different NMS software packages.

Think about the tools you’re using. Are you using a point solution, like SolarWinds Orion NPM, which focuses on Performance monitoring, or an Open Source tool like RANCID, which focuses on Configuration?

Or are you looking at a single-pane-of-glass (SPOG) solution like HP’s IMC, which provides full FCAPS (and more!) in the base package?

What tools are you using? Are they full FCAPS? Or are they more focused on one particular area?

2 thoughts on “FCAPS – A Quick Introduction”

Lindsay Hill (@northlandboy) says:

August 23, 2012 at 3:59 am

I like the FCAPS model. It’s reasonably easy to go through the options, and work out what coverage you need. It’s more practical than jumping straight to ITIL models too, when you’re focusing on network management.

I’ve been doing some investigation into tools recently, and it seems that the F part is covered reasonably well by everyone, P is OK, but it’s the C part that is either missing, or poorly done. Too many tool suites are quite disparate. NNMi is making moves now (finally) to better integrate the monitoring, performance and configuration aspects, but they’re still fundamentally three separate tools, glued together. Orion has better integrated NCM/NPM recently.

I’m of the opinion that everyone should be looking for integrated toolsets, but I’m also finding that most places still need to retain point solutions for specific areas – e.g. because they’ve got some weird network item that doesn’t work well with the rest of the tools.

I’m also finding a lot of places don’t really do the C part at all. I think this is a big mistake, but the costs are very high for some products, or people are still scarred by CiscoWorks experience from several years ago.

I’m also seeing that the places that do use some tools for Config mgmt are only doing the basics – backup/restore, and diff reporting. I’m not seeing a lot of people trusting their NCCM tool to actually go out and make day to day changes. Maybe it gets used for bulk SNMP ACL updates, or something simple, but not for typical provisioning. Most engineers just jump back to PuTTY for that.

- netmanchris says:
  
  August 24, 2012 at 1:44 am
  
  It’s either nice, or scary, to know that we’re seeing the same thing on opposite sides of the world. 🙂
  
  I’m glad to hear that SolarWinds Orion is getting better. I took the SCP based on Orion 9 and I wasn’t overly impressed with the integration between NPM and NCM at the time. Seemed more like a common GUI than anything which was actually integrated from a programmatic/federated application level.
  
  I totally agree on the point solutions. The major one that I see is firewalls, and I’m not sure that this will ever change because of the very specific management requirements. UC&C (VoIP) is the other one which I think will retain its own management console for a while yet. I think they should still probably roll up alarms and config backup functions to the “main” NMS, consolidate as much as you can in a single place, and use context-sensitive launch to move into the other apps as necessary.
  
  I’m reading a book right now on maturity models and the author puts forth that most current networks (the book is circa 2007) are performing ad-hoc management at best. I’m rolling around a blog post in my head to try to articulate this, but I think that all of those keyboard jockeys are going to have to learn some new tricks or fall into irrelevancy. I think there’s a very good chance that SDN will just become network management in a few years. I’m hoping that this will force people into much better network management practices.
  
  @netmanchris
