Many times when I’m speaking with customers, one of the first questions I get asked is:
“OK, I’ve got this NMS. What’s the first thing I should do that’s going to make the biggest difference in my network?”
There are probably a lot of opinions on the answer to this question. For me, the answer is always this:
Start with Configuration Management.
In ITILv3, one of the main aspects of the configuration management domain is to track all of the configuration items (CIs) that relate to an IT service. For more on ITILv3 CIs, check out this video.
For those of you who suffer from insomnia and would like a cure, most of the ITILv3 change management stuff is found in Volume III, Service Transition. In ITILv3, the first thing you need to do is to define your CMS.
Configuration Management System
This is the ITIL term for the software that handles your configs for you.
Again, remember that ITIL is about process, so it’s possible to actually run an ITIL-based shop without tools in place. It’s POSSIBLE… but I think this falls in the JBYCDMYS (Just Because You Can Doesn’t Mean You Should) bucket.
What to look for in your CMS
So for NMS newbies who are trying to get into more process-driven network operations, your CMS is the software that does basic tasks like these:
Backup of Configurations
Any NCCM solution should allow you to back up configurations. If you’re lucky, your NMS may have additional features that let you move beyond basic configuration backups. Ideally, your NMS will let you define configuration baselines and snapshots for any given device.
Configuration Baselines: A configuration baseline is the configuration of a service, product or infrastructure that has been formally reviewed and agreed on, that thereafter can be changed only through formal change procedures.
Configuration Snapshots: A snapshot of the current state of a configuration item or an environment. It also serves as a fixed historical record.
In plain English terms, a configuration baseline is the last point where you know for certain that everything was working. A snapshot is an automatic backup that tells you what the state of the device was at the time of that backup.
We’ll come back to this in a subsequent blog post, but snapshots are also great to have around for helping to address your compliance initiatives like SOX, PCI, or HIPAA. Having a configuration snapshot from a certain date is an easy way to prove to the auditors what the configuration state of a given device was on that date.
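To make the baseline versus snapshot idea a bit more concrete, here is a minimal sketch in Python that diffs a snapshot against a baseline using the standard library’s difflib. The device name and config lines are made up for illustration, and this is not how any particular NMS does it internally.

```python
import difflib

def config_drift(baseline: str, snapshot: str) -> list[str]:
    """Return unified-diff lines showing how a snapshot differs from the baseline."""
    diff = difflib.unified_diff(
        baseline.splitlines(),
        snapshot.splitlines(),
        fromfile="baseline",
        tofile="snapshot",
        lineterm="",
    )
    return list(diff)

# Hypothetical configs: the baseline is the last reviewed, known-good state;
# the snapshot is whatever last night's automatic backup captured.
baseline_cfg = "hostname access-sw-01\nntp server 10.0.0.1\n"
snapshot_cfg = "hostname access-sw-01\nntp server 10.0.0.2\n"

for line in config_drift(baseline_cfg, snapshot_cfg):
    print(line)
```

An empty result means the device still matches its baseline. Anything else is drift that either needs to be rolled back or formally approved as the new baseline.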
Configuration Templates: A complete device configuration, or a portion of one.
This could be your standard configuration for your access switches, a secure configuration for your routers, or even just a portion of a configuration, such as the config required to change the local admin password on all your switches.
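As a rough illustration of a partial template, the sketch below uses Python’s string.Template to render the same snippet for several devices. The template text, device names, and addresses are all hypothetical, and the config syntax is generic rather than any specific vendor’s.

```python
from string import Template

# Hypothetical partial template; the syntax is illustrative, not vendor-exact.
SNIPPET = Template("hostname $hostname\nntp server $ntp_server\n")

def render(template: Template, **values: str) -> str:
    # substitute() raises KeyError when a placeholder has no value,
    # so a half-filled template never gets pushed to a device.
    return template.substitute(**values)

device_names = ["access-sw-01", "access-sw-02"]
rendered = {
    name: render(SNIPPET, hostname=name, ntp_server="10.0.0.1")
    for name in device_names
}

for name, config in rendered.items():
    print(f"--- {name} ---")
    print(config)
```

Using substitute() rather than safe_substitute() is a deliberate choice here: failing loudly on a missing value is safer than deploying a literal "$ntp_server" to a switch.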
Scheduling Configuration Changes: The ability to schedule changes to your network devices at a specific time.
The ability to schedule changes is nice. Assuming your changes have gone through a peer-review process and through your company’s Change Approval Board, why do you need to be up at 3am during your company’s change window?
Now there may be cases where you will still need to be onsite to verify that a critical change went through, and to perform the change validation tests that I KNOW you all had in your change plan. Right?
But for those cases where you are simply changing a local admin password, adding an NTP server, or making some other low-risk change, you may want to just schedule it for the wee hours of the morning while you are home in your toasty bed.
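Here is a small sketch of that idea in Python: work out when the next change window opens and queue only the low-risk changes for unattended execution. The 3am window, the risk labels, and the function names are assumptions for illustration, and it uses naive local datetimes to keep things simple.

```python
from datetime import datetime, time, timedelta

def next_change_window(now: datetime, window_start: time = time(3, 0)) -> datetime:
    """Return the next time the change window opens, strictly after `now`."""
    candidate = datetime.combine(now.date(), window_start)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate

def plan_unattended(changes, now: datetime):
    """Pair each low-risk change with the next window; higher-risk changes are left out."""
    start = next_change_window(now)
    return [(description, start) for description, risk in changes if risk == "low"]

# Hypothetical approved changes as (description, risk) pairs.
pending = [
    ("rotate local admin password", "low"),
    ("add NTP server", "low"),
    ("replace core routing policy", "high"),
]

current_time = datetime(2024, 1, 10, 14, 30)
for description, start in plan_unattended(pending, current_time):
    print(f"{start:%Y-%m-%d %H:%M}  {description}")
```

The high-risk change is intentionally excluded from the plan. That one is where you still want a human awake and running the validation tests.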
One last thing…
When making major or minor changes to your network configurations, it’s good practice to go back and update your CMS to reflect the new Configuration Baseline for that device. You did actually run through a series of tests to make sure you didn’t break something, right?
So although your CMS could be a TFTP server on the network somewhere, hopefully it’s software that will automate the backup of network device configurations for you. Examples could include HP’s Intelligent Management Center, SolarWinds Orion, Cisco Prime, or perhaps an open source tool like RANCID.
In this video, I’ll go through the basic CMS functions of HP’s IMC to show how baselining and snapshots can be applied.
The DML (Definitive Media Library, in ITIL terms) is really nothing more than a software library. Ideally, this should be tied directly into your element management system so that you can define the baseline software image, deploy the image out to the appropriate devices, and audit the network to ensure that all of the devices are in line with your golden software definitions.
As I laid out in the last post, standardization is there to make your lives easier. But it takes a lot of commitment, especially if your network has gone through significant “organic growth”. Making the choice to commit to good configuration management hygiene is sort of like committing to going to the gym or committing to eat healthier.
Just like going to the gym, the first thing you need to do is figure out your current software state. Hopefully, your NMS software will have the ability to discover and audit the software running on the devices in your network and report against a known good state.
Audit the Current State of the Network
If you don’t have an NCCM tool in place with these features, you may end up writing scripts or, worst case, logging into your devices manually and noting the software version in an Excel spreadsheet. Once you have a handle on what’s out there, the next step is choosing what version of code you need to be running.
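If you do end up scripting the audit yourself, the reporting half can be as simple as the sketch below. It assumes you have already collected an inventory somehow; the device names, platform labels, and version strings are all made up for illustration.

```python
# Hypothetical golden versions per platform; the version strings are made up.
GOLDEN_VERSIONS = {
    "access-switch": "7.1.045",
    "wan-router": "15.2.4",
}

# Hypothetical inventory, e.g. exported from an NMS or gathered by a script.
inventory = [
    {"name": "access-sw-01", "platform": "access-switch", "version": "7.1.045"},
    {"name": "access-sw-02", "platform": "access-switch", "version": "5.20.99"},
    {"name": "wan-rtr-01", "platform": "wan-router", "version": "15.2.4"},
    {"name": "fw-01", "platform": "firewall", "version": "9.1.2"},
]

def audit_software(devices, golden):
    """Return (name, running, expected) for every device off its golden version.

    Platforms with no golden version defined are reported with expected=None.
    """
    findings = []
    for device in devices:
        expected = golden.get(device["platform"])
        if device["version"] != expected:
            findings.append((device["name"], device["version"], expected))
    return findings

for name, running, expected in audit_software(inventory, GOLDEN_VERSIONS):
    print(f"{name}: running {running}, expected {expected or 'no baseline defined'}")
```

Reporting platforms with no baseline at all is deliberate. Those are usually the devices nobody has made a decision about yet.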
Choosing your Software Version
So now that you’ve figured out that your devices are all over the place, it’s time to figure out what version of software you actually want to be running. Whether you are running Comware, IOS, NXOS, Junos, FTOS, or some other OS that I haven’t mentioned, the guidelines are pretty much the same.
Wash, Rinse and Repeat.
What about the exceptions?
I was going to try to sugarcoat this, but I’ll just come out and say it. Cisco has licensing for many of their platforms, and this can create situations where you can’t actually get on a common code version without incurring additional CAPEX for the licenses and OPEX to deal with SMARTnet. Or you can get into a situation where the features you’re looking for are mutually exclusive in two different IOS images for your routers. Or you’re running Cisco CallManager, and your gateways require the Voice image while your regular WAN routers require another image.
In any event, my recommendation is still the same. Find the fewest possible combinations of software for the hardware platforms in your network and stick to them unless there is a REALLY good reason to change.
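One way to see how far you are from “fewest possible combinations” is to count the distinct software versions per hardware platform. The sketch below does that in Python; the platform labels and version strings are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical fleet as (platform, software version) pairs.
fleet = [
    ("access-switch", "7.1.045"),
    ("access-switch", "7.1.045"),
    ("wan-router", "15.2.4"),
    ("wan-router", "15.1.3"),
    ("wan-router", "15.2.4"),
]

def version_spread(devices):
    """Count how many devices run each version, grouped by platform."""
    spread = defaultdict(Counter)
    for platform, version in devices:
        spread[platform][version] += 1
    return spread

def needs_consolidation(spread):
    """Return the platforms that are running more than one software version."""
    return sorted(p for p, versions in spread.items() if len(versions) > 1)

spread = version_spread(fleet)
for platform in needs_consolidation(spread):
    print(platform, dict(spread[platform]))
```

Any platform this flags is either a candidate for cleanup or one of those REALLY good reasons, in which case it’s worth writing the reason down.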
Check out this video of the basic NCCM features in HP’s Intelligent Management Center to help you navigate through your software baseline woes.
Anything I missed here? Feel free to post in the comments below.