Sometimes Size Matters: I’m sorry, but you’re just not big enough.

So now that I’ve got your attention… I wanted to put together some thoughts on a design principle I call the acceptable unit of loss, or AUL.

Acceptable Unit of Loss: def. A unit to describe the amount of a specific resource that you’re willing to lose

 

Sounds pretty simple doesn’t it?  But what does it have to do with data networking? 

White Boxes and Cattle

2015 is the year of the white box. For those of you who have been hiding under a router for the last year, a white box is basically a network infrastructure device, right now limited to switches, that ships with no operating system.

The idea is that you:

  1. Buy some hardware
  2. Buy an operating system license ( or download an open source version) 
  3. Install the operating system on your network device
  4. Use DevOps tools to manage the whole thing

Add ice. Shake, and IT operational goodness ensues.

Where’s the beef?

So where do the cattle come in?  Pets vs. Cattle is something you can research elsewhere for a more thorough treatment, but in a nutshell, it’s the idea that pets are something you love and care for and let sleep on the bed and give special treats to at Christmas. Cattle, on the other hand, are things you give a number, feed from a trough, and kill off without remorse if a small group suddenly becomes ill. You replace them without a second thought.

Cattle vs. Pets is a way to describe the operational model that’s been applied to server operations at scale. The metaphor looks a little like this:

The servers are cattle. They get managed by tools like Ansible, Puppet, Chef, SaltStack, Docker, Rocket, etc… which, at a high level, are all tools that allow a new version of a server to be instantiated with a very specific configuration, with little to no human intervention. Fully orchestrated.

Your server starts acting up? Kill it. Rebuild it. Put it back in the herd.

Now, one thing that a lot of enterprise engineers seem to be missing is that this operational model is predicated on your application having a well-thought-out scale-out architecture, one that allows the distributed application to continue to operate when the “sick” servers are destroyed and to seamlessly integrate the new servers into the collective without a second thought. Pretty cool, no?

 

Are your switches Cattle?

So this brings me to the Acceptable Unit of Loss. I’ve had a lot of discussions with enterprise-focused engineers who seem to believe that white box and DevOps tools are going to drive down all their infrastructure costs and solve all their management issues.

“It’s broken? Just nuke it and rebuild it!”  “It’s broken? grab another one, they’re cheap!”

For me, the only way this particular argument from customers holds up is if their AUL metric is big enough.

To hopefully make this point I’ll use a picture and a little math:

 

Consider the following hardware:

  • HP C7000 Blade Server Chassis – 16 Blades per Chassis
  • HP 6125XLG Ethernet Interconnect – 4 x 40Gb Uplinks
  • HP 5930 Top of Rack Switch – 32 40G ports, but from the data sheet “ 40GbE ports may be split into four 10GbE ports each for a total of 96 10GbE ports with 8 40GbE Uplinks per switch.”

So let’s put this together

[Hand-drawn diagram: six HP C7000 chassis, each with dual 6125XLG interconnects, uplinked to a pair of HP 5930 ToR switches]

So we’ll start with

  • 2 x HP 5930 ToR switches

For the math, I’m going to assume dual 5930s with dual 6125XLGs in each C7000 chassis, and that all links are redundant, which makes the math a little easier. ( We’ll only count this against 1 x 5930, cool? )

  • 32 x 40Gb ports on the HP 5930 – 8 x 40Gb ports reserved for uplinks = 24 x 40Gb ports for connecting to those HP 6125XLG interconnects in the C7000 blade chassis.
  • 24 x 40Gb ports from the HP 5930 will allow us to connect 6 x 6125XLGs using all four 40Gb uplinks on each.

Still with me? 

  • 6 x 6125XLGs means 6 x C7000s, which then translates into 6 x 16 physical servers.

Just so we’re all on the same page, if my math is right: we’ve got 96 physical servers in six blade chassis, connected through the interconnects at 320Gb ( 4 x 40Gb x 2 – remember the redundant links? ) to the dual HP 5930 ToR switches, which in turn have 640Gb of bandwidth out to the spine ( 16 x 40Gb – that’s the 8 x 40Gb we reserved on each HP 5930 ).

If we go with a conservative VM to server ratio of 30:1,  that gets us to 2,880 VMs running on our little design. 
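If you want to check my work, the napkin math fits in a few lines of Python ( just a sketch of the numbers above; the ratios are the same assumptions stated in the post ):

# napkin math for the design above: dual 5930s, all links redundant, counted against one 5930
ports = 32                            # 40Gb ports per HP 5930
uplink_ports = 8                      # 40Gb ports reserved for the spine
server_ports = ports - uplink_ports   # 24 x 40Gb left for the 6125XLG interconnects

chassis = server_ports // 4           # each 6125XLG uses 4 x 40Gb uplinks -> 6 chassis
servers = chassis * 16                # 16 blades per C7000 -> 96 physical servers
vms = servers * 30                    # conservative 30:1 VM ratio -> 2,880 VMs

print(chassis, servers, vms)          # 6 96 2880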

How much can you lose?

So now is where you ask the question:  

Can you afford to lose 2,880 VMs? 

According to the cattle & pets analogy, cattle can be replaced with no impact to operations because the herd will move on without noticing. I.e., the Acceptable Unit of Loss is small enough that you’re still able to get the required value from the infrastructure assets.

The obvious first objection I’m going to get is

“But wait! There are two REDUNDANT switches right? No problem, right?”

The reality of most networks today is that they’re designed to maximize network throughput and make efficient use of all available bandwidth. MLAG, in this case brought to you by HP’s IRF, allows you to bind interfaces from two different physical boxes into a single link aggregation pipe. ( Think vPC, VSS, or whatever other MLAG technology you’re familiar with. )

So I ask you: what are the chances that you’re running those links at below 50% of the available bandwidth?

Yeah… I thought so.

So the reality is that when we lose that single ToR switch, we’re actually going to start dropping packets somewhere, because you’ve been running the system at 70-80% utilization to maximize the value of those infrastructure assets.

So what happens to TCP-based applications when we start to experience packet loss?  For a full treatment of the subject, feel free to go check out Terry Slattery’s excellent blog on TCP Performance and the Mathis Equation. For those of you who didn’t follow the math, let me sum it up for you.

Really Bad Things.  

On a ten gig link, bad things start to happen at 0.0001% packet loss. 
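If you want to sanity-check that number, the Mathis Equation bounds TCP throughput at roughly ( MSS / RTT ) x ( 1 / sqrt(p) ). Here’s a quick Python sketch; the 1460-byte MSS and 1ms RTT are my illustrative assumptions, not numbers from Terry’s post:

from math import sqrt

mss = 1460 * 8     # maximum segment size, converted to bits
rtt = 0.001        # round-trip time in seconds ( 1ms as an illustrative value )
p = 0.000001       # 0.0001% packet loss, expressed as a probability

max_throughput = (mss / rtt) * (1 / sqrt(p))
print('%.2f Gbps' % (max_throughput / 1e9))   # ~11.68 Gbps: right at the ceiling of a 10Gb link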

Are your Switches Cattle or Pets?

So now that we’ve done a bit of math and metaphors, we get to the real question of the day: Are your switches cattle? Or are they pets? I would argue that if you’re measuring your AUL in less than 2,000 servers, then your switches are probably pets. You can’t afford to lose even one without bad things happening to your network, and more importantly, to the critical business applications being accessed by those pesky users. Did I mention they’re the only reason the network exists?

Now, this doesn’t mean that you can’t afford to lose a device. It’s going to happen. Plan for it. Have spares, support contracts, whatever. But my point is that you probably won’t be able to go with a disposable infrastructure model like the one suggested by many of the engineers I’ve talked to in recent months about why they want white boxes in their environments.

Wrap up

So are white boxes a bad thing if I don’t have a ton of servers and a well-architected distributed application? Not at all! There are other reasons why white box could be a great choice for comparatively smaller environments. If you’ve got the right human resource pool internally with the right skill set, there are some REALLY interesting things you can do with a white box switch running an OS like Cumulus Linux. For some ideas, check out this Software Gone Wild podcast with Ivan Pepelnjak and Matthew Stone.

But in general, if your metric for Acceptable Unit of Loss is not measured in Data Centres, Rows, Pods, or Entire Racks, you’re probably just not big enough.

 

Agree? Disagree? Hate the hand drawn diagram? All comments welcome below.

 

@netmanchris

Introduction to R and SWIRL

So I’m taking a Coursera course from Johns Hopkins on Data Science.

The course uses the R programming language, which is a derivative of the S programming language that came out of Bell Labs in the ’70s. I’m a huge believer in network programmability and SDN in general. From a traditional network management point of view, most of the work getting done and discussed today is really around the C in the FCAPS model. There are some people, like Jason Edelman, Matt Oswald, etc… who are using network programmability to automate troubleshooting tasks, but most of those are pretty straightforward:

  • Automate information gathering
  • Automate troubleshooting
  • Identify the corrective action

Once you’ve got the corrective action nailed down, you could also automate the fix, but there are a lot of people who are still nervous about having changes happen without a human being involved. 

Automating configuration management and configuration based fault detection and error correction are great things. But there are other parts of the network that can benefit from the application of a programming language to old problems. 

I’m personally interested in the massive amounts of data that the network holds. We’ve got a ton of instrumentation within the network that is just sitting there, waiting to be accessed, tracked, and mined for useful insights.

Data Science is all about different methods for sifting through all that data in a scientifically reproducible manner, to hopefully gain some insights.

The Tools

Like Python, R has an interactive environment that allows you to run R code interactively or through R files. It can be downloaded at the CRAN site here.

There’s also a better IDE called RStudio, which adds some extra functionality and is available here.

SWIRL is a library that gives learners access to interactive tutorials written in R, for R. There’s a Git repository here which provides a set of tutorials for different courses, letting you get a feel for the language syntax, creating functions, etc…

 

R Studio and Swirl

Once you install the SWIRL library, which is really easy using the RStudio Install Packages feature, you load it ( think IMPORT in Python ) using the library(swirl) function. Once you’ve done that, you can either download the course files from the Git repository, or you can install them directly from within R ( which uses curl in the background to download the files directly into your working directory ). Roughly, it looks like the sketch below.
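Here’s roughly what that looks like at the R console ( a sketch; the course name matches the swirl course repository, but check the repo for the current list ):

install.packages("swirl")          # one-time install; the RStudio Install Packages feature does the same thing
library(swirl)                     # think IMPORT in Python
install_course("R Programming")    # pulls the course files into your working directory
swirl()                            # start the interactive lessons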

As you can see in the screen capture, I’ve got a few different courses installed, and each course has a bunch of lessons inside it. The screen capture shows the lessons within the R Programming course. What’s also cool about this is that, assuming you’re enrolled in the Coursera R Programming course, you can complete a lesson, input your username and password ( specific to your course, not your Coursera password ) and, magically, you get extra credit for the course lessons you complete.

Extra credit is a good thing.

 

Wrap Up

I’ve only been into R for about a week. It’s got some nice features, but to be honest, I don’t have enough coding experience to really give a qualified opinion on the subject. I’ll continue to work with it and see where things go. There’s still a ton of Python that I need to learn, but I’ve already found a Python library called rpy2 that allows me to access native R libraries from within my Python code. Best of both worlds, I guess. 🙂
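As a tiny, hedged taste of rpy2 ( this uses the rpy2.robjects high-level interface; the R expression itself is just an illustration ):

>>> import rpy2.robjects as robjects   # pip install rpy2; R itself must also be installed
>>> robjects.r('mean(c(1, 2, 3, 4))')[0]   # run a line of R from Python and pull out the result
2.5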

 

Bringing Wireless Back into the Fold

I’m sitting in the airport in Barcelona, having just had an amazing week of conversations ranging from potentially core-belief-shattering to crazy ideas for puppet shows. The best part of these events, for those of us who are social, is the ability to interact in meatspace with people we’ve already “known” for a period of time on Twitter. I had the pleasure this week of hanging out with such luminaries of the networking social scene as Tom Hollingsworth ( @networkingnerd ), Amy Arnold ( @amyengineer ), Jon Herbert ( @mrtugs ), Ethan Banks ( @ecbanks ) and, not to be left out of any conversation, Mr. Greg Ferro.

 

There were a lot of great conversations, and more than a couple of Packet Pushers shows recorded during the week, but the one that’s sticking in my mind right now is a conversation we had around policy and wireless. This has been on my mind for a while now, and I think I’ve finally thought it through enough to put something down on paper.

Before we get started, I think it’s important that everyone understand that I’m not a wireless engineer. I’m making some assumptions here, and I’m hoping to be corrected in the comments if I’m headed in the wrong direction.

 

Wireless: The original SDN

So in many ways, the wireless networking folks have been snickering at us wired lovers for our relatively recent fascination with SDN. Unsurprisingly, I’ve heard a lot of snark on this particular subject for quite a while. The two biggest examples being:

  • Controller Based networking? That’s sooooooo 2006. Right?
  • Overlays?  We’ve been creating our own topologies independent of the physical layout of the network for years!!!!

 

I honestly can’t say I disagree with them in principle, but I love considering the implications of how SDN, combined with the move to 802.11ac, is going to really cause our worlds to crash back together.

 

I’m a consumer of wireless. I love the technology and have great respect for the witchdoctor network engineers who somehow manage to keep it working day in and day out. I’m pretty sure that where I have blue books on my bookshelf, they have a small altar to the wireless gods. I poke fun, but it’s just such a different discipline, requiring intense knowledge of the transmission medium, that I don’t think a lot of wired engineers can really understand how complicated wireless can be, and how much of an art form creating a good, stable wireless design actually is.

On a side note, I heard this week that airlines actually load sacks of potatoes into their airplanes when performing wireless surveys, to simulate the conditions of passengers in the seats. If that doesn’t paint a picture of the differences with wireless, I don’t know what does.

 

The first wireless controller I had a chance to work with was the Trapeze solution, back in approximately 2006. It was good stuff. It worked. It allowed for centralized monitoring, not to mention centralized application of policy. The APs were 802.11g and it was awesome. I could plug in an AP anywhere in the network and the traffic would magically tunnel back to the controller, where I could set my QoS and ACLs to apply the appropriate policies and ensure that users were granted access and priority, or not, to the resources I wanted. Sounds just like an overlay, doesn’t it?

In campus environments, this was great. An AP consumed a theoretical bandwidth of 54Mbps and we had, typically, dual Gig uplinks. If we do some really basic math here, we see the following equation

[Napkin math: 2 x 1Gb uplinks = 2,000Mb of capacity ÷ 54Mbps per AP ≈ 37 APs before the uplinks are oversubscribed]

 

Granted, this is a napkin calculation to make a point.  But you can see it would be REALLY hard to oversubscribe the uplinks in this kind of scenario. There weren’t that many wireless clients at the time. AP density wasn’t that bad. 2.4 GHz went pretty far and there wasn’t much interference.

[Napkin math: 2 x 1Gb uplinks = 2,000Mb of capacity ÷ roughly 1,300Mbps per 802.11ac AP ≈ 2 APs before the uplinks are oversubscribed]

 

Hmmm… things look a little different here. Now, there are definitely networks out there that have gone to 10Gb connections between their closets in the campus. But there is still a substantial number running dual Gig uplinks between their closets and their core switches. I’ve seen various estimates, but consensus seems to suggest that most end stations connected to the wireless network consume, on average, about 10% of the actual bandwidth, although I would guess that’s moving up as rich media ( video ) gets more and more widely used.

Distributed Wireless

We’ve had the ability to let wireless APs drop wireless client traffic directly onto the local switch for years, although vendors have implemented this feature at different points in their product life cycles; I think it’s safe to say it’s a me-too feature at this point. I don’t see it implemented that much, though, because, in my opinion, having a centralized point in the network, a.k.a. the controller, to tunnel all of my traffic back to gives me a single point at which to apply policy. Because of the limited bandwidth, we could trade off the potential traffic trombone of wireless going back to the controller to access local resources for the simplicity of centralized policy.

Now that a couple of 802.11ac access points can potentially oversubscribe the uplinks on your switch, I think we’re quickly going to have to rethink that decision. Centralized policy not only won’t be worth the cost of the traffic trombone, but I would argue it’s just not going to be possible because of the bandwidth constraints.

 

I’m sure some people who made the decision to move to 10Gb uplinks will continue to find centralized policy to be the winner of this trade-off, but for a large section of network designers, this just isn’t going to be practical.

Distributed Policy

This is where things start to get really interesting for me. Policy is the new black. Everyone’s talking about it. Promise Theory, declarative intent, Congress, etc… There are a lot of theories and ideas out there right now, and it’s a really exciting time to be in networking. I don’t think this is a problem we’ll solve overnight, but I think we’re going to have to start laying the foundation now, with more consistent designs and configurations, so that we can offer a consistent, semi-homogeneous base when the policy discussion starts resulting in real products.

What do I mean by this? Right now, there are really two big things that will help drive this forward.

Globally Significant, but not Unique VLANS

Dot1x, or more accurately the RADIUS protocol, allows us to send back a tunnel-group attribute in the RADIUS response that corresponds to a VLAN ID ( name or dot1q tag are both valid ). We all know the horrors of stretched VLANs, but there’s no reason you can’t reuse the same VLAN number in various places in the network, as long as there’s a solid L3 boundary in between them and they’re assigned different L3 address space. This means that we’re going to have to move back towards L3 designs and turn to configuration tools to ensure that VLAN IDs and naming conventions are standardized and enforced across the global network policy domain. The attributes involved are sketched below.
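For the curious, the VLAN assignment rides on three standard RADIUS attributes ( names per RFC 3580; shown in a generic users-file style rather than any particular vendor’s syntax, and the VLAN value is a made-up example ):

 Tunnel-Type = VLAN

 Tunnel-Medium-Type = IEEE-802

 Tunnel-Private-Group-ID = "110"

The authenticating switch drops the session into VLAN 110 ( or a VLAN named “110” ) for the life of that session.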

Consistent Access Control Lists and QoS Policies

RADIUS can also send back a specific attribute in the RADIUS response that tells the switch to apply a specific ACL or QoS policy to the authenticated connection for the duration of that session. Some vendors, but not all, allow for dynamic instantiation of the ACL/QoS policy, but most still require the ACL or QoS construct to already be present in the network device before the RADIUS policy can consume that object. This means we’re going to be forced to turn to configuration management tools to make sure these policy enforcement objects are present in all of the network devices across the network, regardless of the medium. A common example is sketched below.
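As a hedged example, the most common hook for this is the standard Filter-Id attribute ( RFC 2865 ), which names an ACL that must already be staged on the switch ( the ACL name here is a made-up example ):

 Filter-Id = "EMPLOYEE_ACL"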

 

The future

I think we’re swiftly arriving at a point where wireless cannot be designed in a vacuum as an overlay technology. The business needs policy to be consistently applied across the board, and bandwidth to be available and efficiently used. I don’t see any way to do this without ensuring that we start to ignore the medium type that a client connects on. On the bright side, this should result in more secure, more flexible, and more business-policy-driven wired connectivity in the coming years. I don’t believe we’ll be thinking about how the client connected anymore. We won’t care.

 

Agree? Disagree? Did I miss something? Feel free to comment below!

@netmanchris

Working with PYSNMP cont’d – simple SNMP SETs

So, inspired by @KirkByers’ blog on SNMP, which I wrote about here using Python 3, I decided I wanted to go past just reading SNMP OIDs and see how PYSNMP could be used to set SNMP OIDs as well. As with anything, it’s best to start with the docs and apply the KISS principle.

Why SNMP?

Although RESTful APIs are definitely an easier, more human-readable way to get data in and out of a network device, not all network devices support modern APIs. For a lot of the devices out there, you’re still stuck with good ol’ SNMP.

Tools

SNMP is a horribly un-human-readable language. We have MIBs to translate those nasty strings of digits into something that our mushy brains can make sense of, and a tool that loads those MIBs for us: that’s known as a MIB browser. There are a lot of MIB browsers out there, and which one you use doesn’t matter much, but I would HIGHLY recommend that you get one if you’re going to start playing with SNMP.

http://www.ireasoning.com has a free version that works great on Windows and supports up to 10 MIBs loaded at the same time.

The Goal

There are a lot of powerful actions available through the SNMP interface, but I wanted to keep this simple. For this project, I wanted to use SNMP to change the sysContact and sysLocation fields.

For those of you who are used to a CLI, the section of the config I’m looking after resembles this.

 snmp-agent sys-info contact contact

 snmp-agent sys-info location location

 

The Research

So I know what I want to change, but I need to know how to access those values through SNMP. For this, I’ll use the MIB browser.

 

Using the MIB browser, I was able to find the sysLocation object, which is identified as .1.3.6.1.2.1.1.6.0

[Screenshot: MIB browser showing sysLocation at .1.3.6.1.2.1.1.6.0]

I was also able to find the sysContact object, which is identified as .1.3.6.1.2.1.1.4.0

[Screenshot: MIB browser showing sysContact at .1.3.6.1.2.1.1.4.0]

 

So now I’ve got the two OIDs that I’m looking for.

The Code

Looking through the PYSNMP documentation, there’s an example of how to do an SNMP SET. But it includes a couple of options that I didn’t want; specifically, I didn’t want to use the lookupNames option. So I changed lookupNames to False, and then I was able to use the OIDs above directly, without having to find the names of the MIB objects.

Looking through the code below, you can see that I’ve created a function which takes an input, syscontact, and uses it as the value to set on the MIB object .1.3.6.1.2.1.1.4.0, which corresponds to the snmp-agent sys-info contact … line in the configuration of the device.

from pysnmp.entity.rfc3413.oneliner import cmdgen

cmdGen = cmdgen.CommandGenerator()

# using the PYSNMP library to set the network device's sysContact field using SNMP
def SNMP_SET_SYSCONTACT(syscontact):
    errorIndication, errorStatus, errorIndex, varBinds = cmdGen.setCmd(
        cmdgen.CommunityData('private'),
        cmdgen.UdpTransportTarget(('10.101.0.221', 161)),
        (cmdgen.MibVariable('.1.3.6.1.2.1.1.4.0'), syscontact),
        lookupNames=False, lookupValues=True)
    # check for errors and print out results
    if errorIndication:
        print(errorIndication)
    else:
        if errorStatus:
            print('%s at %s' % (errorStatus.prettyPrint(),
                                errorIndex and varBinds[int(errorIndex) - 1] or '?'))
        else:
            for name, val in varBinds:
                print('%s = %s' % (name.prettyPrint(), val.prettyPrint()))
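Since sysLocation lives right next door at .1.3.6.1.2.1.1.6.0, it’s a small step to generalize this. Here’s a hedged sketch of my own ( not from the PYSNMP docs ) that takes the OID as an argument, so the same function covers both fields:

# hypothetical generalization: pass the OID in rather than hard-coding sysContact
def SNMP_SET(oid, value):
    return cmdGen.setCmd(cmdgen.CommunityData('private'),
                         cmdgen.UdpTransportTarget(('10.101.0.221', 161)),
                         (cmdgen.MibVariable(oid), value),
                         lookupNames=False, lookupValues=True)

SNMP_SET('.1.3.6.1.2.1.1.4.0', 'test@lab.local')   # sysContact
SNMP_SET('.1.3.6.1.2.1.1.6.0', 'Lab Rack 4')       # sysLocation; the value is a made-up example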

 

 Running the Code

Now we run the code

>>>
>>> SNMP_SET_SYSCONTACT('test@lab.local')
1.3.6.1.2.1.1.4.0 = test@lab.local
>>>

Looking back at the MIB browser, we can see that the sysContact value has been changed to the new value above.

[Screenshot: MIB browser showing the updated sysContact value]

And when we log back into the network device

 snmp-agent sys-info contact test@lab.local

 snmp-agent sys-info location location

 

Wrap Up

This is just a small proof-of-concept to show that, although RESTful APIs are definitely sweeter to work with, they are not the only way to programmatically interface with network devices.

Comments or Questions?  Feel free to comment below

 

@netmanchris

 

 

 

Surfing your NMS with Python

Python is my favourite programming language. But then again, it’s also the only one I know. 🙂

I made the choice to go with Python because, honestly, that’s what all the cool kids were doing at the time. But after spending the last year or so learning the basics of the language, I do find that it’s something I can easily consume, and I’m starting to get better with all the different resources out there. BTW, http://www.stackoverflow.com is your friend. You will learn to love it.

On with the show…

So in this post, I’m going to show how to use Python to build a quick script that will let you issue the RealTimeLocate API call to the HP IMC server. In theory, you can build this against any RESTful API, but I make no promises that it will work without some tinkering.

Planning the project.

I’ve written before about how I’m a huge fan of OPML tools like Mind Node Pro. The first step for me was planning out the pieces I needed to make this:

  • usable in the future
  • actually work in the present

In this case I’m far more concerned about the present, as I’m fairly sure that I will look back on this code a year from now and think some words that I won’t put in print.

Aside: I’ve actually found that the troubleshooting skills I’ve honed over the years as a network engineer help me immensely when trying to decompose what pieces will need to go into my code. I actually think that network engineers have a lot of skills that are extremely transferable to the programming domain, especially because we tend to think of the individual components and the system at the same time, not to mention our love of planning out failure domains and forcing our failures into known scenarios as much as possible.


Auth Handler

Assuming that the RESTful service you’re trying to access requires you to authenticate, you will need an authentication handler to deal with the username/password stuff that a human being is usually required to enter. There are a few different options here. Python actually ships with urllib or some variant, depending on the version of Python you’re working with. For ease-of-use reasons, and because of a strong recommendation from one of my coding mentors, I chose to use the Requests library. It’s not shipped by default with the version of Python you download over at http://www.python.org, but it’s well worth the small effort of pip’ing it into your system.

The beautiful thing about Requests is that the documentation is pretty good and easily readable.

In looking through the HP IMC eAPI documentation and the Requests library, I settled on HTTP digest auth.


So here’s how this looks for IMC.

Building the Authentication Info

>>> import requests   # imports the requests library; you may need to pip this in if you don't have it already
>>> from requests.auth import HTTPDigestAuth   # imports the HTTPDigestAuth method from the requests library
>>>
>>> imc_user = '''admin'''   # the username used to auth against the HP IMC server
>>> imc_pw = '''admin'''   # the password of the account used to auth against the HP IMC server
>>>
>>> auth = requests.auth.HTTPDigestAuth(imc_user, imc_pw)   # puts the username and password together and stores them as a variable called auth

We’ve now built the auth handler to use the username “admin” with the password “admin”. For a real environment, you’ll probably want to set up an operator group with access only to the eAPI functions, and lock this down to a secret username and password. The eAPI is powerful; make sure you protect it.

Building the URL

So for this to work, I need to assign a value to the host_ip variable so that the URL will complete with a valid response. The other thing to watch for is types. Python can be quite forgiving at times, but if you try to add two objects of the wrong types together… it mostly won’t work. So we need to make sure host_ip is a string, and the easiest way to do that is to put three quotes around the value.

In a “real” program, I would probably use the input function to allow this variable to be input as part of the flow of the program, but we’re not quite there yet.

>>> host_ip = '''10.101.0.109'''   # variable that you can assign to a host you want to find on the network
>>> h_url = '''http://'''   # prefix for building URLs; use http or https
>>> imc_server = '''10.3.10.220:8080'''   # IP and port of the IMC server; default ports are 8080 or 8443
>>> url = h_url + imc_server   # combines the h_url and the IP address of the IMC box as a base URL to use later
>>> find_ip_host_url = ('''/imcrs/res/access/realtimeLocate?type=2&value=''' + host_ip + '''&total=false''')   # the RealTimeLocate API path, with the host_ip variable spliced in
>>>

Putting it all together.

This line puts together the URL that we’re going to send to the web server. You could ask, “Hey man, why didn’t you just drop the whole string into one variable to begin with?” That’s a great question. There’s a concept in programming called DRY ( Don’t Repeat Yourself ). The idea is that when you write code, you should never write the same thing twice; think in a modular fashion, which allows you to reuse pieces of code again and again.

In this example, I can easily write another f_url variable and assign to it another RESTful API path that gets me something interesting from the HP IMC server. I don’t need to rewrite the h_url portion or the server IP address portion of the URL. Make sense?

>>> f_url = url + find_ip_host_url   # a very simple operation that joins url and find_ip_host_url into the full URL for the HTTP call
>>>

Executing the code.

Now the last piece is where we actually execute the code. This will issue a GET request using the Requests library. It will use f_url as the actual URL it’s going to request, and it will use the auth variable we created in the Building the Authentication Info step above to automatically populate the username and password.

The response will get returned in a variable called r.

>>> r = requests.get(f_url, auth=auth)   # using the requests library's get method, we pass f_url as the URL to access and auth to define how we authenticate. Pretty simple, actually.
>>>

The Results

So this is the coolest part. We can now see what’s in r. Did it work? Did we find our lost, scared little host? Let’s take a look.

>>> r
<Response [200]>

Really? That’s it?

The answer is “yes”. That’s what’s been assigned to the variable r. 200 OK may look familiar to you voice engineers who know SIP, and it means mostly the same thing here. It’s a response code to let you know that your request was successful – but not the content we’re looking for. I want that content, right? If I do a type(r), which tells me what Python knows about what kind of object r is, I get the following.

>>> type(r)

<class 'requests.models.Response'>

So this tells us that maybe we need to go back to the Requests documentation and look for info on the Response object. That shows us how to access the part of the response I actually wanted to see: the reply to my request about where the host with IP address 10.101.0.111 is located on the network.

So let’s try out one of the options and see what we get

>>> r.content
b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><list><realtimeLocation><locateIp>10.101.0.111</locateIp><deviceId>4</deviceId><deviceIp>10.10.3.5</deviceIp><ifDesc>GigabitEthernet1/0/16</ifDesc><ifIndex>16</ifIndex></realtimeLocation></list>'

How cool is that? We put in an IP address and we actually learned four new things about that IP address without touching a single GUI. And the awesome part? This works across any of the devices that HP IMC supports.
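To pull those four facts out in code rather than eyeballing the XML, here’s a minimal sketch using the standard library’s ElementTree ( note that r.content is bytes; ET.fromstring() accepts it directly ):

import xml.etree.ElementTree as ET

root = ET.fromstring(r.content)   # parse the XML payload from the response
for loc in root.findall('realtimeLocation'):   # one element per located host
    print(loc.findtext('deviceIp'), loc.findtext('ifDesc'))   # 10.10.3.5 GigabitEthernet1/0/16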

Where to from here?

So we’ve just started on our little journey here. Now that we have some hints to the identity of the network device and the specific interface currently harbouring this lost host, we can use that data to continue filling in the picture.

But that’s in the next blog…

Comments or Questions?  Feel free to post below!

Working with RESTful APIs

There is a lot of talk about APIs right now. Every vendor has an API, but not all are created equal. What does an API even mean? I’m not going to get too wrapped up in definitions, but I’ll provide you a link.

A more formal definition of REST may be found here. For my purposes, I propose the following:

RESTful API

Something that I can work with using the HTTP protocol and probably returns data in XML or JSON.

 

Some examples

I’m working with HP’s Intelligent Management Center and its eAPI, which offers a RESTful interface to the network management system and will return both XML and JSON.

Here’s an example of a call using XML

[Screenshot: an eAPI call returning XML]

Update: For those who noticed – the URL for both is the same, but the content of the HTTP request actually tells a slightly different story.

Here’s the Wireshark capture of the XML request. If you look at the trace below, you can see the Accept: application/xml\r\n header, which shows that the request is asking for XML.

[Screenshot: Wireshark capture showing Accept: application/xml]

Here’s an example of a call using JSON.

[Screenshot: an eAPI call returning JSON]

 

Here’s the JSON request, which is essentially the same except that the Accept: header specifies application/json.

[Screenshot: Wireshark capture showing Accept: application/json]

 

 

 

As you can see, they carry the exact same data, but the way the data is structured is slightly different. From what I can tell, programmatically speaking, there’s no real difference: you can definitely work with either one easily.
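As a quick sketch of what choosing between the two looks like from code ( reusing the Requests digest-auth approach from the previous post; the server address and credentials are lab examples ):

import requests
from requests.auth import HTTPDigestAuth

auth = HTTPDigestAuth('admin', 'admin')   # lab credentials; protect these in real life
url = 'http://10.3.10.220:8080/imcrs/res/access/realtimeLocate?type=2&value=10.101.0.111&total=false'

xml_r = requests.get(url, auth=auth, headers={'Accept': 'application/xml'})    # same call, XML back
json_r = requests.get(url, auth=auth, headers={'Accept': 'application/json'})  # same call, JSON back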

For right now, I’ve chosen to focus on just XML, under the theory that if I figure out how to work with XML first, I can go back and learn how to work with JSON later, after my overall skills as a programmer have increased.

 

It’s a theory.

 

Note: For those of you who haven’t seen this interface, it’s a standard ( at least for HP ) RS-Docs interface which provides the documentation for the RESTful interface directly on the machine and allows you to test it. This is also available for the HP SDN controller.  

For IMC, it can be accessed at http://localhost:8080/imcrs and will require you to authenticate. BTW, the port numbers may also change if you installed with something other than the default settings.

 

RealTimeLocation API – Breaking it Down.

For those of you with keen eyes, you can probably guess that this particular API is used to locate a host on the network. Let’s break it down a little so that we can see what the return is actually telling us here.

 

<list> # This lets us know this is the start of the list

<realtimeLocation> # This lets us know the data below is about realtimeLocation

<locateIp>10.101.0.111</locateIp> # This is the IP address we wanted to locate

<deviceId>4</deviceId> # This is the device ID of the switch where we found it.

<deviceIp>10.10.3.5</deviceIp> # This is the IP address of the switch where we found it.

<ifDesc>GigabitEthernet1/0/16</ifDesc> # This is the interface description where we found it.

<ifIndex>16</ifIndex> # This is the ifIndex value of the interface where we found it.

</realtimeLocation> # This lets us know that the realtimeLocation data has ended.

</list> # This lets us know that the list has ended.

 

So a couple of quick notes about this list.

  • Device ID is an internal numbering scheme that HP IMC uses to keep track of the devices. It has no practical relation to anything outside the IMC system.
  • ifDesc is the SNMP ifDesc. You may be tempted to look at this and think “That must be the description on the interface!!!” You would be wrong. The description you configure on the interface when you type in the command “description This_Is_My_Interface” is actually held in ifAlias ( 1.3.6.1.2.1.31.1.1.1.18 ). Blame SNMP for this one.
  • ifIndex is the SNMP ifIndex value. This is, again, an easy way for computers to keep track of the number of the port. It’s also important to know that on some vendors’ devices these values can change with a reboot. Cisco used to have this issue, but they now allow you to make ifIndex persist across reboots.

 

Next Time

 

So this has been a brief introduction to XML and an example of a RESTful API. As you can see, it’s not that intimidating. It’s actually almost readable by a human being.

In the next post, I’m going to look at building some basic Python code to use this API directly.

 

Questions or comments? Please post below and I’ll be happy to do my best. Again, I’m a student in progress here, so please take any answers I give with a grain of salt. If you’re further along on this journey than I am, please feel free to suggest improvements in the comments as well. I wouldn’t say I’ve got no ego, but I’ll check it at the door if it helps me improve.

Network Developer: A Network Engineer’s Journey into Python

Like most other people in the networking industry, I’ve been struggling to answer the question of whether or not network engineers need to become programmers. It’s not an easy question to answer, and a few years down this SDN journey, I’m still no closer to figuring out whether network engineers need to fall into one of the following categories.

Become Full-Time Software Developers

DaveTucker

For those of you who don’t know @dave_tucker, he was a talented networking engineer who chose to make the jump to becoming a full-time programmer: creating consumption libraries in Python for the HP VAN SDN Controller, contributing to the OpenDaylight controller, and now joining up with @networkstatic and @MadhuVenugopal, two more great examples, to form SocketPlane, focused on the networking stack in Docker.

Gain some level of proficiency in a modern programming language

One of the people that I think has started to lead in this category is @jedelman8. Jason is a CCIE who glimpsed what the future may hold for some in our profession and has done a great job sharing what he’s been learning on his journey on his blog at http://www.jedelman.com/. Definitely check it out if you haven’t already.

This is also where I’ve chosen to be for now. The more I code, the more I think it’s possible that I could go full programmer, but I also love networking too much. I guess the future will tell on that one.

For this category, this will mean putting in extra time on nights and weekends to focus on learning the craft. As someone once told me, it takes about 10 years to become a really good network engineer; no one can be expected to become a good programmer in a year, especially not with a full-time day job.

On the bright side, there are a lot of resources out there, like:

Coursera.org – Just search for the keyword “python” and there are several good courses that can help you gain the basics of the language.

CodeAcademy.com – CodeAcademy has a focused Python track that will give you some guided hands-on labs, as long as you have an internet connection.

pynet.twb-tech.com – @kirkbyers has put together an email-led Python course specifically for network engineers over at pynet.twb-tech.com. He’s also got some great blogs that discuss how to use Python for functions that are specifically related to network engineers’ day-to-day jobs. Having something relevant always helps to make your life easier.

Gain the ability to think programmatically and articulate this in terms software developers understand

I don’t have any really good examples of this particular category. For some reason that has so far eluded me, there just aren’t many network engineers in this category. If you know of any great examples, please comment below and I’ll be happy to update the post!

This is where I was a couple of years ago. I knew logic. I could follow simplistic code if it was already written, and I could do a good enough job communicating with my programming friends to ensure that the bottle of tequila I was bribing them with would most likely result in something like what I had in my head.

 

Stay right where they are today. 

The starfish is one of the few creatures in the history of evolution that went, “Hmmm. I think I’m good!” This isn’t a judgement, but you need to decide where you want to be, and if starfish is it… you might find your future career prospects limited.


 

 

Journey Ahead

 

As I get back into actually posting, I’m planning on sharing some of the simplistic code that I’ve been able to cobble together. I make no claims as to how good this code is, but I hope that it will inspire someone else reading this to take some classes, find a project, and then write and share some small script or program that makes their life just a little bit easier. Guys like Jason have done this for me. I recently hit a place where I finally have enough skills to be able to accomplish some of the goals I had in mind. My code is crap, but it’s so simplistic that it’s easy to understand exactly what I’m doing. And that’s where I think the value comes from sharing right now.

 

Comments or thoughts? Please feel free to comment below!