Automating your NMS build – Part 4 Adding Custom Views

This is part four in a series on using Python and the RESTful API to automate the configuration of the HP IMC network management station. As with all things, it’s easier to learn something when you have a good reason to use those skills. I decided to extend my Python skills by figuring out how to configure my NMS using the RESTful API.

If you’re interested, check out the other posts in this series

Creating Operators

Adding Devices

Changing Device Categories

Adding Custom Views

In this post, we’re going to use the RESTful API to programmatically add a custom view, and then add devices to that custom view. For those of you who don’t know, a custom view is simply a logical grouping of devices. In IMC, custom views also form the basis of the topology maps. In fact, a custom view and a topology map are essentially the same object internally. The difference is just whether you choose to look at it as a list of devices in the normal interface or in the typical topology/Visio format which we all know and might-not-love.

Why might we want to add a custom view programmatically, you ask? Well, the answer might simply be that we’re lazy and it’s easier and faster. Or it might be that you don’t want to have to look through all the different devices, or, as in my case, that you simply want an excuse to extend your Python skills. Whatever your reasons are, custom views are something that just don’t get used enough, in my opinion.

Custom views are a great way to zoom in on the status of a specific branch, a geographic area, or maybe a logical grouping of devices that support a specific application. It really doesn’t matter, and the best part is that a single device can exist in multiple custom views at the same time, so there’s really no limit to how you put these views together. It all depends on what makes sense to you.

 

The Code

This code is a little bit more complicated than some of the other examples we’ve looked at. We’re actually going to be using multiple functions together, but the logic should be pretty easy to follow.  I’m sure there are better ways to do this, but this seems to work for me.  I’m sure I’ll be back here in a year going “Why did I write it that way!?!?!?!?!?” but for now, I hope it’s simple enough for someone else to follow and possibly get inspired. 

The main function is really just calling the other functions which we will break down below

Step 1 – Create the Custom View

In this code, we’re going to use a small function that I created to gather the name of the custom view ( the view_name variable ) and then use that to create the JSON payload ( the payload variable ). The other part of the JSON payload is the autoAddDevType variable, which is hard-coded to 0. This could be used to have the system automatically add new devices of a given type, but I’m interested in doing the automation myself here. The last part of this code returns the view name, which will be used as the input for another function. You can see this in the main function in the view_name = create_new_view() line.
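Here is a minimal sketch of the payload-building part of that step. The helper name build_view_payload is hypothetical, the "name" key is an assumption based on the field the view listing uses later, and whether the API wants autoAddDevType as a number or a string is not confirmed here. Sending the payload to the NMS is left out.

```python
import json


def build_view_payload(view_name, auto_add_dev_type=0):
    """Return the JSON body for a new custom view as a string.

    Only the two fields mentioned in the text are included; the key
    "name" is an assumption made for this sketch.
    """
    payload = {"name": view_name, "autoAddDevType": auto_add_dev_type}
    return json.dumps(payload)


# Example: the body that would be sent for a view called "Branch-42".
example_payload = build_view_payload("Branch-42")
```

Building the payload with json.dumps rather than by gluing strings together means a view name containing quotes or other special characters is escaped correctly.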

Step 2 – Get Custom Views

For the next part, we need to figure out what ID was assigned to this new view. To do this, we’re going to have to go through a couple of steps. The first one is to ask the NMS to send us a list of all the known custom views. The following function will request the list, which will be returned as a JSON array, and then convert it into a Python list of dictionaries so we can work with it natively as a Python object. You can see this in the main function in the view_list = get_custom_views() line.
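The conversion half of that step can be sketched on its own, without talking to the NMS. The function name parse_custom_views and the sample response text are made up for illustration; only the 'name' and 'symbolId' fields come from the description in this post.

```python
import json


def parse_custom_views(response_text):
    """Convert the JSON text of a view listing into a list of dicts."""
    data = json.loads(response_text)
    if not isinstance(data, list):
        # This sketch only handles the JSON-array case described in the text.
        raise ValueError("expected a JSON array of custom views")
    return data


# Hypothetical response body, standing in for what the NMS would return.
sample_response = (
    '[{"name": "Core", "symbolId": "1007"},'
    ' {"name": "Branch-42", "symbolId": "1021"}]'
)
view_list = parse_custom_views(sample_response)
```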

Now that we have the view_list, which is the list of all the views, we’re going to have to find the ID for the new view that we just created. We do that by using the get_view_id() function with the view_name as the input. Essentially, this will look through each of the views in the view_list that was returned above and let us know when the ‘name’ value is equal to the view_name value that we captured above. Once it’s equal, we return the ‘symbolId’, which is the internal unique numeric value assigned to this particular custom view. This is the number we’re going to use to identify the view that we want to add devices to. Make sense? Now that we’ve got this number, we’re going to assign it to the object view_id for use later on.
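The lookup itself is only a few lines. This is a sketch rather than the exact function from my script: here the view_list is passed in explicitly as an argument, and None is returned when no view matches.

```python
def get_view_id(view_list, view_name):
    """Return the 'symbolId' of the view called view_name, or None."""
    for view in view_list:
        if view.get("name") == view_name:
            return view.get("symbolId")
    return None


# Made-up sample data in the shape described above.
views = [
    {"name": "Core", "symbolId": "1007"},
    {"name": "Branch-42", "symbolId": "1021"},
]
view_id = get_view_id(views, "Branch-42")
```

Returning None for a missing view lets the caller decide whether that is an error or just a reason to create the view first.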

Note: I actually could have added the devices directly to the view in the original add_custom_view code above, but then I’d have to write the modify function later if I ever wanted to change or add new devices to the view. I’m trying to follow the DRY ( Don’t Repeat Yourself ) advice here, so I wrote the modify function now and can leverage it later without having to re-write the code.

Step 3 – Generate the Device List

We’re not going to spend too much time on this part as I’m essentially re-using the code from the Changing Device Categories blog. It’s pretty straightforward. I’m using some user input to gather a list of devices that we want to add to this specific view and then capturing it in the dev_list object. What’s cool about doing it this way is that the search covers all the IP addresses assigned to your devices, not just the managed address which might come up in the NMS interface where you would normally perform this step. There are ways around that as well, but that’s another blog.

Step 4 – Add Devices to Existing Custom View

This last step is where things come together. We’re going to run the add_device_to_view(dev_list, view_id) function, which uses the dev_list generated in step 3 and the view_id that we captured at the end of step 2 as the inputs. Essentially, we’re just saying here “add all the devices I want to the view I just created”.
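To show how the four steps hang together, here is a sketch of the main flow with each step passed in as a callable. The names run_view_workflow and build_dev_list are hypothetical, and the real functions would talk to the NMS; taking them as arguments just lets the flow be exercised with stand-ins.

```python
def run_view_workflow(create_new_view, get_custom_views, get_view_id,
                      build_dev_list, add_device_to_view):
    """Chain the four steps; each argument is a callable for one step."""
    view_name = create_new_view()                  # Step 1: create the view
    view_list = get_custom_views()                 # Step 2: list all views
    view_id = get_view_id(view_list, view_name)    # Step 2: find the new ID
    if view_id is None:
        raise LookupError(f"custom view {view_name!r} not found")
    dev_list = build_dev_list()                    # Step 3: pick the devices
    add_device_to_view(dev_list, view_id)          # Step 4: add them
    return view_id, dev_list
```

Stopping with an error when the view ID can’t be found avoids calling the final step with an empty ID.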

Wrapping it Up

So this is just an example of how you can tie a few pieces of code together to help automate something that might otherwise take you a lot of manual labour. In this case, that labour is a bunch of mouse clicks, and it depends on you being able to manually identify all the devices you want to add to a specific view. Personally, I’d rather leave the hard work to the computers and move on to something that requires my brain.


Automating your NMS build – Part 3 Changing Device Categories

In the first couple of posts in this series, we created some operators, then we added some devices. In this post, we’re going to look at something a little more complicated: linking a couple of different Python functions together to meet the requirement, which is to change a device from one category to another.

A bit about Device Categories

For those of you who haven’t used HP’s Intelligent Management Centre before, the system automatically categorizes any discovered device, usually based upon the SNMP sysObjectID. What this means is that when you run an auto discovery, the majority of your infrastructure will be properly classified right out of the box.


This works great for devices which are SNMP enabled, as well as some other devices like ESX machines ( Virtual Devices ) which use SOAP as the management protocol, but it fails pretty miserably when dealing with devices which don’t support anything more than PING.  IMC’s default behaviour is to put anything which doesn’t respond to SNMP in the Desktop Category.

IP Phones aren’t smart

I’ve had customers who decided to discover all of their expensive IP phones and suddenly found out that none of them were classified properly. The problem with IP phones is that they are usually pretty stupid devices. Low memory, weak CPUs. Most of the processing/thinking is done by the PBX. I’ve never seen an IP phone that supports SNMP.

In this case we have two choices

  1. Manually change each IP phone from the Desktop category into the Voice category one device at a time.
  2. Use the RESTful API and a bit of code to do it while we go have a coffee.

Filtering First

So the first thing we need to do is to identify the IP phones from the rest of the devices in the Desktop category. Thankfully, this is where having a well designed network can come in REALLY handy. Most Voice networks are designed so that the IP phones are automatically put into a voice vlan. This means that all phones SHOULD be in the same layer 3 network range.

The first piece of code we need to write is a simple one which will allow the user to identify which of the categories they want to filter by. Although this might sound strange, we actually need to make sure that we only grab the DESKTOP devices in a specific subnet range. Imagine if you accidentally moved the router or switch for this subnet into the voice category too. It really sucks when you lose your router, right?

In this piece of code, we’re simply creating a dictionary which links the categoryId, which is the number IMC internally uses to identify the defined categories, to the labels we humans use to identify them. In a nutshell, this simply prints out the available categories. Why, you ask? Because the human running this script needs to know which categories they want to filter by.
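A sketch of that idea is below. The ID-to-label pairs are illustrative assumptions only and should be checked against your own IMC system; format_categories is a hypothetical helper added so the output can be built separately from printing it.

```python
# Illustrative mapping only: these categoryId values are assumptions
# for this sketch, not a confirmed list from IMC.
CATEGORIES = {
    "0": "Routers",
    "1": "Switches",
    "2": "Servers",
    "6": "Voice",
    "9": "Desktops",
}


def format_categories(categories=CATEGORIES):
    """Return one 'id  label' line per category, sorted by numeric ID."""
    ordered = sorted(categories.items(), key=lambda item: int(item[0]))
    return "\n".join(f"{cat_id:>3}  {label}" for cat_id, label in ordered)


def print_category():
    """Show the available categories to the person running the script."""
    print(format_categories())
```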

This next piece of code performs two different functions:

  1. Filters all known devices by the categories listed above
  2. Filters all known devices by an IP range.

Combining the two of them, we’re able to easily find all of the devices that have been classified in the Desktop Category in the L3 subnet of the IP phones and return that as a dictionary which contains, among other things, the device IDs which we’ll need in the next step.
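The two filters can be sketched locally with Python’s standard ipaddress module. This is not the original function: in this version the device records are already in memory, and the field names "id", "ip" and "categoryId" are assumptions made for the example.

```python
import ipaddress


def filter_dev_category(devices, category_id, network):
    """Return devices in the given category whose IP is inside network."""
    subnet = ipaddress.ip_network(network)
    return [
        dev for dev in devices
        if str(dev.get("categoryId")) == str(category_id)
        and ipaddress.ip_address(dev["ip"]) in subnet
    ]


# Made-up inventory: one phone, the subnet's router, and an unrelated PC.
inventory = [
    {"id": "10", "ip": "10.20.30.11", "categoryId": "9"},
    {"id": "11", "ip": "10.20.30.1", "categoryId": "0"},
    {"id": "12", "ip": "10.99.0.5", "categoryId": "9"},
]
phones = filter_dev_category(inventory, "9", "10.20.30.0/24")
```

Because both conditions must hold, the router at 10.20.30.1 is left alone even though it sits in the voice subnet, which is exactly the accident we are trying to avoid.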

Putting it all together

So the last piece of code is where the magic actually happens.

First we assign the output of the filtering function above into the variable called dev_list.  Essentially, this is a list of devices which meet the search criteria of the filtering function.

In our case, this means the list will consist of all the IP phones in the specific subnet that we filtered.

From there, we use the items in dev_list as input into a for loop which changes them into the new category.  ( see how we use the same print_category function from above? )
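The loop can be sketched like this. The send argument is a hypothetical stand-in for whatever function actually makes the REST call for one device, and the "id" field name is an assumption.

```python
def change_category(dev_list, new_category_id, send):
    """Call send(device_id, new_category_id) for every device in dev_list.

    Returns the list of device IDs that were processed, in order.
    """
    changed = []
    for dev in dev_list:
        send(dev["id"], new_category_id)
        changed.append(dev["id"])
    return changed
```

Keeping the network call behind a parameter makes it easy to do a dry run first, for example by passing a function that only prints what it would change.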

Wrapping it up

Hopefully this is a fairly useful example of how putting a few basic python functions together can help to substantially cut down on the amount of time it takes to perform what’s really a simple task. That’s the whole point of automation right?

As I continue learning, I can already see there are a bunch of ways to improve this code, but hopefully having a simple working example will help people who are a couple of steps behind me on this journey take another step forward.

If you’re ahead of me on this journey and have suggestions, please feel free to comment!

@netmanchris

Playing with Solarwinds Orion NPM – How to recover from a corrupted database

I can’t believe it’s been that long, but I recently realized that my Solarwinds SCP has actually lapsed. The SCP was one of the first certifications focused on network management and, as I’m sure you can imagine, I was in there as an early adopter. The training was really good ( I still miss Josh Stevens! ) and the test was one of the best tests I’ve ever taken in IT. It had some REALLY evil questions on there. You know the kind… the ones that prove you either know your stuff or you don’t. No messing around with ambiguities. Ahh… good times. On to the present though.

Open Disclosure

I’m assuming it’s because the major focus of my blog is network management that I was approached by Solarwinds and offered an NFR license for a couple of their products to run in my labs. Just as I was open with them, I think it’s important for my readers to understand that I work for HP and sometimes find myself in direct competition against these products. I also find myself giving some guidance to customers who are using Solarwinds products and trying to manage their HP Networking products through the Orion console. It’s the experience of using NPM and NCM to manage HP Networking equipment that I’m going to try to focus on. Please don’t ask me to compare products. I told you where I get paid and you can guess what my official opinion is going to be. 🙂

Orion NPM

The first product I wanted to play with is Solarwinds NPM. Solarwinds has a great following and has been around for a lot of years. There were some things that I really didn’t like about it a few years back when I passed the SCP, and it will be interesting to see how the product has improved over time and whether my old issues have been fixed.

Specifically, I was never happy with the half-enabled web console. The fact that I had to bounce back and forth between the Windows console and the web browser to get anything done was frustrating, to say the least. I know there were a lot of improvements made in Orion 10, and I’ve heard good things about 10.5 specifically. I downloaded 10.5 and will be upgrading to 10.6 with hot fix 3 tonight. I’m really excited to see the improvements that Solarwinds has made in the years since I last had my hands dirty with the platform. Wish me luck!

Before we get started…

So this post details some of the issues I had getting NPM up and running. To say the least, I had some issues ( as detailed below ). I’ve written down the symptoms and the fixes that I went through, but to be honest, this was just a REALLY bad Windows build. Sometimes, there’s just nothing you can do when the base operating system gets corrupted right from the initial install.

 

Installing

To be honest, I had some issues getting it running. The licensing actually crashed and somehow the license was assigned in the Solarwinds system, but never applied to my system. I also made the mistake of downloading the package that didn’t have SQL included ( it wasn’t clear and I didn’t read the documentation closely enough ). On the bright side, Solarwinds support actually helped me through this one in about 24 hours. Sometimes things happen during an install, so I can’t complain too much. Plus, I should have paid more attention when flipping through the documentation. My bad.

Unscheduled Interruption

Ahhh… well… Sometimes things don’t go as planned.  I had an unscheduled power outage tonight and it seems something has gone wrong with my installation.


Google didn’t come up with anything. So I’m off to SQL Server Management Studio, where the SolarWindsOrion database is marked as suspect…. hmmm… that’s not good.

A couple of Scooby snacks and some super-sleuthing later and I come up with this link

In a nutshell, it looks like my SQL database has been corrupted somehow and it’s now showing up as suspect in the Microsoft SQL Server management console. ( While I was banging my head against this problem, I didn’t take a good screen capture. So this is where I ask you to imagine a big yellow exclamation mark of DOOM over the SolarWindsOrion database in the following image. )

[Screenshot: the SolarWindsOrion database in the SQL Server management console]

Looks like the power outage REALLY messed up the SQL database.  But GoogleTechnician to the rescue!

Solarwinds Configuration Wizard – Attempt #1

So now I’m off to the Solarwinds Configuration tool ( on the console of the Windows server ). For this attempt, I run the database configuration only. My thinking: I’ve got a database issue, so let’s just run the database configuration wizard and that should fix it, right?


Nope… doesn’t look like this is going to work either


Solarwinds Configuration Wizard – Attempt #2

So now I’m off to the Solarwinds knowledge base. Hmm… nothing on this error.

At this point, I just try what any good network guy does. I start clicking things and seeing if anything will work.

So this is what I did

  • Logged into the Microsoft SQL Management console and reset the password on the SolarWindsOrionDatabaseUser account to something I knew.
  • Re-ran the Solarwinds Configuration Wizard. This time, instead of just the database, I re-ran it for the Database, Web Site, and the Services.

note: Normally at this point, I would pull the plug, call the patient dead and re-install. But this was supposed to be a learning experience, right? We’re certainly learning now, aren’t we?


Looks like I’m back in business! Good to go, right?


Nope. Now it’s time to remove the license, delete the VM and start from scratch. I don’t want a known corrupted system monitoring my network, even in a lab.

Hopefully, this blog will help someone with a production Solarwinds deployment who gets this same nasty SQL suspect database error.

Lesson to Learn

In a lab, sometimes things happen. Take the opportunity for the full learning experience when things go wrong. It’s always fun to see if we can bring a system back from the dead. But remember, once you’re done with the learning, scrap it. This is not the system that I want to be evaluating, as I will always be wondering “Hmmm… I wonder if this is normal or if this is a result of that bad install.”

Things go wrong. Known good clone images just have something funky. I’ve seen registry issues on brand new Windows installations, SQL strangeness, etc., none of which I feel like dealing with for longer than necessary. With how easy it is now to deploy a new VM from a template, there’s just no need to subject myself to this kind of long term pain.

So before I go to bed tonight, I’m going to start cloning a new windows image so that I can re-do the entire install tomorrow night on a clean VM.

FOR THE RECORD :  I’m 100% sure this is not a normal Solarwinds Orion NPM installation tale. I just happened to be the lucky one who was hand-selected by the universe as it thought ” Hmmm…  who can I REALLY mess with today? “.

Can’t wait for tomorrow.

@netmanchris

Juniper EX4200T- Management Observations

So I’ve been spending some time playing around with a Juniper EX4200T from a management standpoint.

This post is just a place to put some observations and questions. Hopefully, some Junos Peeps will be able to shed a little light on some of these questions.

First, as both a criticism and a defence: Juniper does not use SNMP as their primary interface. I get that SNMP has its problems, but it’s what we have, and if you want to bring Juniper into a network where there is already a network management system in place, I would think that they should at least do the minimum to improve their SNMP support to meet the bar.

I have to say; as an operationally focused network engineer, it does disturb me that I can’t even set the sys location from a simple SNMP set command.   

ifIndex

One of the first things I noticed about the Juniper box is that they seem to have some strangeness, at least compared to other vendors, around the number of interfaces. Specifically, I’ve got a 48 port switch with more than twice that many interfaces. Upon a closer look, it seems that the Juniper switches, or at the very least the EX4200T, have two index values for every physical port.

Juniper SNMP Interfaces

 

One of the interesting questions that comes up here is ” What ifIndex value do I poll?”. I’d like to get interface stats on this device, but do I poll the ethernet port, or the prop virtual port? And if both return the same values, why would I choose one over the other?

Anyone have a good explanation of WHY they went this direction? @steve did suggest to think of this like a sub interface in Cisco terms.  I’ve been trying to figure this out, but the most common reason I’ve used a sub-interface has been to create dot1q routing on a stick configurations.  I don’t see how that applies here?
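While waiting for a better answer, one way to at least separate the two sets of entries is to group the ifTable rows by their ifType. The sketch below works on rows that have already been collected; the sample rows and ifIndex values are made up, while the ifType numbers are the standard IANA values for ethernetCsmacd (6) and propVirtual (53). It doesn’t settle which entry is the right one to poll, it only makes the duplication easier to see.

```python
from collections import defaultdict

ETHERNET_CSMACD = 6   # IANA ifType: physical Ethernet port
PROP_VIRTUAL = 53     # IANA ifType: proprietary virtual/internal interface


def split_by_iftype(rows):
    """Group ifTable rows (dicts) by their ifType value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["ifType"]].append(row)
    return dict(groups)


def physical_ports(rows):
    """Return only the rows that describe physical Ethernet ports."""
    return [row for row in rows if row["ifType"] == ETHERNET_CSMACD]


# Made-up sample: each physical port is followed by its logical unit.
sample_rows = [
    {"ifIndex": 503, "ifDescr": "ge-0/0/0", "ifType": ETHERNET_CSMACD},
    {"ifIndex": 504, "ifDescr": "ge-0/0/0.0", "ifType": PROP_VIRTUAL},
    {"ifIndex": 505, "ifDescr": "ge-0/0/1", "ifType": ETHERNET_CSMACD},
    {"ifIndex": 506, "ifDescr": "ge-0/0/1.0", "ifType": PROP_VIRTUAL},
]
ports = physical_ports(sample_rows)
```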

 

MIB Walking 

Another strange thing is that it seems that the EX4200 cannot return all the interfaces when reading the ifTable by SNMP.  It may be that this is an issue with my MIB browser, but it’s definitely a pain in the butt.  

Junos-peeps: Anyone have a MIB browser that works here? Suggestions on code? Possibly a bug?

 

VLAN 0

One of the other things I noticed is that the default VLAN of the EX4200 is 0. Huh? VLAN 0? All of the interfaces on the switch belong to VLAN 0 initially. I did find this article on the Juniper website, which says that ” Some attached devices may not accept 802.1q-tagged frames, and therefore can reside only in VLAN 0.”

Coming from a Cisco and HP background, I’ve always seen the native VLAN initially on an interface listed as VLAN 1. Anyone able to explain this to me?

VLAN-Range: Anyone able to explain this to me? I checked the Juniper documentation, but I wasn’t able to find an article which explains exactly what the function is for.

 

If anyone has comments, I’d love to learn here. I freely admit I haven’t had time to get far enough into this to understand the benefits and I do bring the baggage of history to my perspective on this.  If someone has made the jump to Junos, I’d love to hear from you! 

 

@netmanchris

Device Instrumentation

Not all devices are created equal.

I know this seems like a piece of Cap’n Obvious wisdom, but it bears thinking about a little in the context of network management.

One of the things which I see all the time is someone asking to do XYZ on the device, whether that’s pulling serial numbers from power supplies or reading the sticker on the back of a switch. There are some things that are just outside the realm of possibility, or would just be too difficult to put into place.

If you are seriously looking at implementing an NMS, you need to get friendly with SNMP. Simple Network Management Protocol is probably the most common management protocol on the planet.

To be honest, SNMP is a second language and I would highly recommend anyone who wants to get SERIOUS about network management pick up a book or two and start learning it.  SimpleWeb has some tutorials, podcasts, and slide decks that they make available which may be a good place to start. 

In a nutshell, SNMP MIBs fall into two major categories

Public – These are the standard MIBs that are defined by the IETF. These are your friends, the bridge MIB, dot3 MIB, Entity MIB, etc..  MOST vendors should support these.

Private – Also referred to as Enterprise, these are the MIBs which Vendors write to support their own device specific functionality.
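A quick way to tell the two apart is to look at where the OID sits in the tree: most of the standard MIBs live under mib-2 ( 1.3.6.1.2.1 ), while vendor MIBs live under enterprises ( 1.3.6.1.4.1 ). Here is a small sketch of that check; the function name classify_oid is made up, and a few standard MIBs that live elsewhere in the tree will show up as "other".

```python
MIB2_PREFIX = (1, 3, 6, 1, 2, 1)          # iso.org.dod.internet.mgmt.mib-2
ENTERPRISES_PREFIX = (1, 3, 6, 1, 4, 1)   # iso.org.dod.internet.private.enterprises


def classify_oid(oid):
    """Return 'public', 'private' or 'other' for a dotted numeric OID."""
    parts = tuple(int(part) for part in oid.strip(".").split("."))
    if parts[:6] == ENTERPRISES_PREFIX:
        return "private"
    if parts[:6] == MIB2_PREFIX:
        return "public"
    return "other"


# sysLocation.0 is part of the standard system group under mib-2.
sys_location_kind = classify_oid("1.3.6.1.2.1.1.6.0")
# 2636 is Juniper's enterprise number, so this falls in the private tree.
juniper_kind = classify_oid(".1.3.6.1.4.1.2636.3.1")
```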

Occasionally, someone brings in a non-snmp capable device and asks for it to be monitored. And then they complain because you can’t make the same pretty graphs.

If it’s not instrumented in the device, we can’t do anything with it.

Let me say that again…

If it’s not instrumented in the device, we can’t do anything with it.

Here’s an example: Say someone comes to you and says ” Hey! Can you please tell me what the serial number is on the power supply in XYZ vendors chassis switch?”

I check the MIBs and it seems that XYZ vendor hasn’t instrumented serial numbers as one of the pieces of information which they make available. So the answer is ” No, I can’t”

Then they complain that this NMS stuff, or the specific NMS product sucks. Remember

If it’s not instrumented in the device, we can’t do anything with it.

SDN. Who’s going to run it?

Big credit goes to @cloudtoad for putting together this thought-provoking post over at http://www.packetpushers.net.

He makes some very interesting comments and observations, none of which I can actually disagree with.

SDN is a business dream; Where they can buy commodity hardware, do away with high priced router-jockeys, stop paying a premium for mid-range value products, and just focus on whatever their core business actually is.

Unfortunately, I think there’s a lot more to this discussion that I haven’t seen a lot of people address yet. I’m not saying I have any answers here, but I do have a few questions. Hopefully, some of you will post in the comments as to how you see this playing out, because my crystal ball is getting pretty foggy these days. 🙂

1) Skills Gap: It’s been 20 +/- years since the fallacies of distributed computing were laid out. And yet, we’re seeing the past repeat itself again and again. I was asking about the DevOps trend with a customer a few weeks ago and he laughed at the question. They had already tried it he said.

And it failed miserably.

These guys are a startup doing some pretty impressive stuff with BigData ( Hadoop ) and they have a lot of talented coders on staff. This really got me curious. So I asked him “Why?”

His response, which I think applies equally to SDN as it does to DevOps, was the following

” It took you 10-15 years to become a really good network guy. It took them ( pointing over shoulder ) 10-15 years to become really good programmers.”

” I’ll guess you aren’t a good coder, but I can TELL you that they aren’t good network people.”

This has been racing around in my brain for weeks now. Other than the odd exception like @lynxbat ( check out his awesome VMware cloud demo here ), I think it’s safe to say that 99% of the network engineers I know are capable of nothing more than rudimentary scripting, and most of that is regurgitated code from examples downloaded off the Internet.

I have a hard time calling someone who downloads a perl script and hacks in a couple of locally significant values a programmer. And yet this is very much the world we are all talking about moving to.

So where is this new breed of GUI-jockeys going to come from, with the hybrid of coder and network knowledge that will deliver us from our current state of one-protocol-per-problem? Because sadly, I see a shortage of good network folk in general, let alone good network folk with coding skills.

We’ve been slowly automating out all the jobs that green network kids used to cut their teeth on. So where are these new wizards of SDN going to get their network experience to learn the valuable “just because you can…” lessons that all of us have learned over the years?

More than likely, they are going to make some snide comment, as the young are prone to do, on how our IPv4 knowledge just doesn’t apply here anymore, offer us a tin can with a piece of string, and offer to write us a new protocol.

All I can tell you for sure?

“Not on my network.” : )

Is it just me? Does anyone else see this as a problem? And if so, what are you doing to prepare yourself for the coming divide?

@netmanchris