Using JSONSchema to Validate Input

There are a lot of REST APIs out there. Quite a few of them use JSON as the data format for getting data in and out of network devices. There are a lot of network-focused blogs that detail how to send and receive that data, but I wasn’t able to find anything that specifically talked about validating the input and output to make sure we’re sending and receiving the expected information.

Testing is a crucial, and IMO too often overlooked, part of the Infrastructure as Code movement. Hopefully this post will help others start to think more about validating input and output of these APIs, or at the very least, spend just a little more time thinking about testing your API interactions before you decide to automate the massive explosion of your infrastructure with a poorly tested script. 🙂

What is JSONSchema

I’m assuming that you already know what JSON is, so let’s skip directly to talking about JSON Schema and jsonschema. JSON Schema is a way of describing what a valid JSON document looks like, and jsonschema is a Python library which allows you to take your input/output and verify it against a known schema which defines the data types you’re expecting to see.

For example, consider this snippet of a schema which defines what a valid VLAN object looks like:

"vlan_id":
{
    "description": "The unique ID of the VLAN. IEEE 802.1Q VLAN identifier (VID)", 
    "minimum": 1, 
    "maximum": 4094, 
    "type": "integer", 
    "sql.not_null": true
}

You can see that this is a small set of rules that defines what is a valid entry for the vlan_id property of a VLAN.  As a network professional, it’s obvious to us that a valid VLAN ID  must be between 1 and 4094. We know this because we deal with it every day. But our software doesn’t know this. We have to teach it what a valid vlan_id property looks like and that’s what the schema does.


Why do we care?

Testing is SUPER important. By being able to test the input/output of what you’re feeding into your automation/orchestration framework, it can help you to avoid, at worst, a total meltdown, or, at best, a couple of hours trying to figure out why your code doesn’t work.

Using JSONSchema

So the two things you’re going to need to use JSONSchema are

  • The JSON Schema for a specific API endpoint
  • The input/output that you want to validate.

In this case, we’ll use a VLAN object that is coming out of an ArubaOS-Switch REST API.

You did know the ArubaOS-Switches have a REST API, right?

Step 1 – Loading the VLAN object

We’re going to gather the output from the VLANs API. Instead of writing custom code, we’ll just use the pyarubaoss library. I’ll leave you to check out the GitHub repo and just paste the output for a single VLAN here. I’m also going to create a second VLAN object with a vlan_id of 5000, just to show how this works. 5000, of course, is not valid, and we’d like to prove that. Right?
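The original output isn’t reproduced here, so as a stand-in, here is a minimal sketch of what the two objects might look like. Only vlan_id comes from the schema snippet above; the name field and all of the values are assumptions for illustration, not the exact ArubaOS-Switch payload.

```python
import json

# Hypothetical, trimmed-down VLAN objects. Only "vlan_id" is taken from the
# schema snippet earlier in the post; "name" is an illustrative assumption.
vlan_good = json.loads('{"vlan_id": 10, "name": "VLAN10"}')
vlan_bad = json.loads('{"vlan_id": 5000, "name": "VLAN5000"}')

print(vlan_good["vlan_id"])
print(vlan_bad["vlan_id"])
```

Both objects parse as perfectly good JSON, which is the point: being valid JSON says nothing about being a valid VLAN.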

Step 2 – Loading the JSON Schema Definition

Now we have the output, we want to make sure that the output here complies with the JSON Schema definition that we’ve been provided.

Loading the JSON schema

Here’s a sub-set of the JSON schema that defines what a valid VLAN looks like
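As a rough sketch, the sub-set could be held in Python as a plain dictionary. The vlan_id rule is the one quoted earlier in this post; the surrounding "type", "properties" and "required" wrapper is my assumption about how it would sit in a complete schema, and the vendor’s real schema has many more properties.

```python
# Cut-down schema holding only the vlan_id rule quoted earlier in the post.
# The outer wrapper ("type", "properties", "required") is an assumption.
vlan_schema = {
    "type": "object",
    "properties": {
        "vlan_id": {
            "description": "The unique ID of the VLAN. IEEE 802.1Q VLAN identifier (VID)",
            "minimum": 1,
            "maximum": 4094,
            "type": "integer",
            "sql.not_null": True,  # vendor-specific keyword, kept as-is
        },
    },
    "required": ["vlan_id"],
}

print(vlan_schema["properties"]["vlan_id"]["maximum"])
```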

Step 3 – Importing the JSON Schema Library and Validating

Now we’re going to load the JSON Schema library into our python session and use it to validate the VLAN object using the Schema we defined above.

First we’ll look at the vlan_good object and run it through the validate function

As you can see, there’s nothing to see here. Basically this means that the vlan_good object conforms to the provided JSON Schema. The VLAN ID is valid as it’s an integer value between 1 and 4094.

Now let’s take a look at the vlan_bad object and run it through the same validate function

We can see that the validate function now raises an exception and lets us know very specifically that the VLAN ID 5000 is not valid

jsonschema.exceptions.ValidationError: 5000 is greater than the maximum of 4094
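The two validation runs described above can be sketched roughly like this. It assumes the jsonschema package is installed, and the schema and VLAN objects are cut-down stand-ins rather than the real switch output.

```python
from jsonschema import validate
from jsonschema.exceptions import ValidationError

# Stand-in schema and objects; only the vlan_id rule comes from the post.
vlan_schema = {
    "type": "object",
    "properties": {
        "vlan_id": {"type": "integer", "minimum": 1, "maximum": 4094},
    },
    "required": ["vlan_id"],
}
vlan_good = {"vlan_id": 10}
vlan_bad = {"vlan_id": 5000}

# validate() returns None and stays silent when the instance conforms.
validate(instance=vlan_good, schema=vlan_schema)

# For the bad VLAN it raises ValidationError; catching it lets a script react
# instead of dying with a traceback.
bad_error = None
try:
    validate(instance=vlan_bad, schema=vlan_schema)
except ValidationError as err:
    bad_error = err.message
    print("Validation failed:", bad_error)
```

Catching the exception is a design choice for scripts. In an interactive session you would just see the traceback ending in the error message.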

Pretty cool right? We can still definitely shoot ourselves in the foot, but at least we know the input/output data that we’re using for our API is valid. To me this is important for two reasons

  • I can validate that the data I’m trying to send to a given API conforms to what that API is expecting
  • I can validate that the vendor didn’t suddenly change their API which is going to break my code

Wrap Up

There are a lot of networking folk who have started to take on the new set of skills required to automate their infrastructure. One of the most crucial parts of this is testing and validating the code to ensure that you’re not just blowing up your network more efficiently. JSON Schema is one of the tools in your tool box that can help you do that.

Comments, questions? Let me know

@netmanchris


XML, JSON, and YAML… Oh my!

I’m a network engineer who codes. Maybe even a network coder. Probably not a network programmer. Definitely not a programmer who knows networking. I’m in that weird zone where I’m enough of two things that don’t normally go together that it makes some of the conversations I’m having with my peers awkward.

I had one such conversation today trying to explain the different data serialization formats in python and why, at the end of the day, they really don’t matter.

The conversation started with one of those “But they have an XML API!!!” comments thrown out as a criticism of someone’s product. My response was something like “And why does that matter?”

The person who made the comment certainly couldn’t answer that question. It was just something they had read in a competitive deck somewhere.

I’m all about competing and trying to make sure that customers have the BEST possible information to make the best decisions for their particular requirements, but this little criticism was definitely not, IMHO, the best information. In fact, it was totally irrelevant. This post is my way of trying to explain why. Hopefully, this will help clear up some of the confusion around data structures and APIs and why they really don’t matter so much, at least not their formatting.

XML

You can read more about XML here. In a nutshell, XML uses tags, similar to HTML, to represent different values in your data stream. The <item> opens up an item and the </item> closes the item, and what lives between the two is the value for that item. Take a look at the following XML output from the HP IMC NMS. I just cut and pasted this straight out of the API interface, so you should be able to do the same if you want to follow along at home. In this code, I have created a string called x and pasted in the XML formatted text, which is a bunch of information about a Cisco 2811 router that lives in my lab. Pay attention to the values as they will stay the same going through this exercise.

XML is the oldest of the bunch, having become a W3C recommendation in 1998. It’s important to note though that XML is still relevant, being the native data format of NETCONF and still used in a lot of places. It’s old, but that doesn’t mean it’s devoid of value.
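The original XML isn’t reproduced here, so here is a heavily shortened stand-in for the string x. Only the ip value (10.101.0.1) is quoted in this post; the element names and the other values are assumptions for illustration, not the real IMC output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the IMC device XML.
# Element names and values other than the ip are assumptions.
x = """<device>
    <id>1</id>
    <label>Cisco2811</label>
    <ip>10.101.0.1</ip>
</device>"""

# Parsing it with the standard library confirms the string is well-formed XML.
root = ET.fromstring(x)
print(root.find("ip").text)
```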

Ordered Dictionary

A dictionary is a way of storing data in python that uses keys instead of an index to access the content or value of a specific piece of information you want. For example, item['ip'] would return “10.101.0.1” with a dictionary.
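To make the difference concrete, here is a tiny sketch of the same two values held in a list and in a dictionary. The label value is made up for illustration.

```python
# A list is accessed by position, a dictionary by key.
as_list = ["Cisco2811", "10.101.0.1"]
item = {"label": "Cisco2811", "ip": "10.101.0.1"}

print(as_list[1])   # you have to know the ip lives at position 1
print(item["ip"])   # the key says what you are asking for
```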

One of the “issues” with dictionaries is that they have traditionally been unordered. That means that there’s no guarantee that when you print out a dictionary the values will be in the same order. (Pretty obvious when you read the word “unordered”, I know. Since Python 3.7, regular dictionaries do keep insertion order.) The OrderedDict is a “dict subclass that remembers the order entries were added”. So we’re going to use a great little library called xmltodict which takes an XML string (called x above) and transforms it into a python ordered dictionary. Now we can do interesting things to it in python. We can access the keys and get to the values directly. We can iterate over it because it’s one of python’s native data structures. It’s easy to use. People know it and understand it. It’s a good thing. Lists and dictionaries are the bread and butter of data structures in python. You need, need, need them.

In this code example, we’re going to take the XML string from above, run it through xmltodict to convert it to an ordered dictionary and assign it to the variable y. Once I’ve got the ordered dictionary y, I could also use xmltodict to convert it back into XML with little to no effort. Cool?
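A minimal sketch of that round trip, assuming xmltodict is installed and using a shortened, made-up XML string in place of the real IMC output. Depending on the xmltodict and Python versions, parse may hand back a plain dict rather than an OrderedDict; either way you access it by key.

```python
import xmltodict

# Shortened, hypothetical stand-in for the IMC XML string.
x = "<device><label>Cisco2811</label><ip>10.101.0.1</ip></device>"

# XML string -> python dictionary (the root tag becomes the top-level key)
y = xmltodict.parse(x)
print(y["device"]["ip"])

# python dictionary -> XML string again
x_again = xmltodict.unparse(y)
print(x_again)
```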

JSON

JSON has become one of the standard ways to represent data between machines. It’s structured, well understood and it’s mostly human readable. A lot of “newer” systems now use JSON as the default data type. Most RESTful APIs for instance seem to have settled on JSON.

This is where things get interesting. Now that I’ve got XML in an ordereddict, I can use the JSON library to convert it to a JSON formatted string which I can then send along to any system that understands JSON. Or write it to a file, or just stare at those pretty, pretty braces.

Note: If I convert from JSON back to a python structure using the json.loads method, it will actually return a regular dictionary, not an Ordered Dictionary, so the values might appear out of order which COULD, in theory, cause issues with an upstream system, but I haven’t seen that in any of my work.
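Here is a small sketch of both directions using only the standard library. The data is a made-up stand-in for the device dictionary. If the order ever did matter, json.loads accepts an object_pairs_hook argument that rebuilds an OrderedDict instead of a regular dictionary.

```python
import json
from collections import OrderedDict

# Hypothetical stand-in for the ordered dictionary built from the XML.
y = OrderedDict([
    ("device", OrderedDict([("label", "Cisco2811"), ("ip", "10.101.0.1")])),
])

# ordered dictionary -> JSON formatted string
j = json.dumps(y, indent=4)
print(j)

# JSON string -> regular dictionary (the default behaviour)
plain = json.loads(j)

# JSON string -> OrderedDict, by asking for it explicitly
ordered = json.loads(j, object_pairs_hook=OrderedDict)
```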

YAML

Although JSON is “more” readable than XML, it’s still got all those braces and quotes to worry about. And so YAML was born. YAML is easily the most human readable of the formats I’ve worked with. It uses white space and dashes to denote different levels of the data structure. It’s what is commonly used with Jinja2 templating and Ansible and other cool buzzwords that we all are starting to play with.

Just like with the JSON example above, I can take the Ordered Dictionary and convert it to a YAML format (shown below ) and back again.  The yaml.load method does actually return an Ordered Dictionary.
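A minimal sketch of the YAML round trip, assuming PyYAML is installed. Note that it swaps in yaml.safe_dump and yaml.safe_load with a plain dictionary rather than the yaml.load call mentioned above: as far as I know the safe functions refuse to dump an OrderedDict directly and hand back regular dictionaries. The data is a made-up stand-in.

```python
import yaml  # provided by the PyYAML package

# Hypothetical stand-in data, held as a plain dictionary for the safe functions.
y = {"device": {"label": "Cisco2811", "ip": "10.101.0.1"}}

# python dictionary -> YAML text (block style, one key per line)
yaml_text = yaml.safe_dump(y, default_flow_style=False)
print(yaml_text)

# YAML text -> python dictionary
back = yaml.safe_load(yaml_text)
```

The safe variants are the cautious choice for any YAML that comes from somewhere you don’t control.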

 

What’s my point?

So the original criticism was “But they have an XML API!!!” right? Well, in these little code snippets I just demonstrated how, using python and a couple of readily available libraries (pyyaml and xmltodict are not native python and must be installed), I was able to go from XML, to OrderedDict, to JSON, to YAML, with almost no effort. I could take any of these and convert it to something like a Python Pickle, pull it back and convert it to something else. It really doesn’t matter. I can go from one to another without much effort.
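The pickle detour mentioned above would look something like this with the standard library. The data is a made-up stand-in, and the usual caveat applies: only unpickle data you trust.

```python
import json
import pickle

# Hypothetical stand-in data.
data = {"device": {"label": "Cisco2811", "ip": "10.101.0.1"}}

# python dictionary -> pickle bytes -> python dictionary
blob = pickle.dumps(data)
restored = pickle.loads(blob)

# ...and from there on to any other format, JSON for instance
as_json = json.dumps(restored)
print(as_json)
```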

Personally, I don’t like working with XML. I can do it, but I would RATHER work with JSON. But that’s just my personal preference, there’s no technical reason why JSON is superior to XML that I can see. At least not in the implementations and the levels that I’m dealing with.

Just like Bilbo Baggins, I can go there and back again without worrying too much about the actual format in between because when I’m doing something in python, I’m really looking to be working with a native structure like a list or a dictionary anyways.

Anything that I get from an external source, I’m just going to convert into a native python data type, munge away, then I’m going to convert it back to whatever data format I need, be that JSON, XML or YAML, and be on to the next task.

The actual data is what matters.

As long as it’s structured in a way that I can parse easily, I couldn’t care less how it comes in and how it goes out.

Don’t even get me started about simply wrapping CLI commands in XML…

Does that mean the format doesn’t matter at all?

No, I’m sure there are many more experienced programmers who can explain the horror stories of converting between different data formats, or that time when this thing happened that caused this other thing to blow up. But for me, I’d much rather you had a well structured API that gives me data in a way that I can easily access, convert to a format I can work with, and move on.

Hopefully, if you’ve made it to the end of this blog, you’ll agree that the actual format is much less important than you might once have believed. Disagree? Let me know in the comments below. I’m always looking to learn something, and in the coding realm, I know I’ve got a LOT to learn!!!

@netmanchris

 

Automating your NMS build using Python and Restful APIs Part 1 – Creating Operators

It’s a funny world we live in.  Unless you’re hiding under a rock, there’s been a substantial push in the industry over the last few years to move away from the CLI.  As someone right in the middle of this swirling vortex of inefficiency, I’d like to suggest that it’s not so much the CLI that’s the problem, but the fact that each box is handled on an individual basis and that human beings access the API through a keyboard. Not exactly next-generation technology.

 

I’ve been spending a lot of time learning python and trying to apply it to my daily tasks. I started looking at the HP IMC Network Management station a few months ago, mainly as a way to start learning about how I can use python to access RESTful APIs as well as gain some hands-on experience working with JSON and XML. As an observation, it’s interesting to me that I’m using a CLI (python) to configure an NMS (IMC) that I’m using to avoid using the CLI (network devices).

I’ve got a project I’m working on to try and automate a bunch of the initial deployment functions within my NMS. There are a bunch of reasons to do this that are right for the business: being able to push information gathering onto the customer, being able to use lower-skilled (and hence lower paid!) resources to do higher level tasks, and being able to be more efficient in your delivery, undercut the competitors on price and over deliver on quality. It’s a really good project to sink my teeth into and use some of my growing coding skills to make a difference to the business.

This is the first post in which I’ll discuss and document some of the simple functions I’m developing. I make no claims to be a programmer, or even a coder. But I’m hoping someone can find something here useful, and possibly get inspired to start sharing whatever small project you’re working on as well.

 

Without further ado, let’s jump in and look at some code. 

What’s an Operator

Not familiar with HP IMC? You should be! It’s chock full of goodness and you can get a 60 day free trial here. In IMC an Operator is someone who has the right to log into the system and perform tasks in the NMS itself. The reason they use the word operator vs. user is that there’s a fully integrated BYOD solution available as an add-on module which treats a user as a resource, which of course is not the same thing as an administrator on the system.

IMC’s got a full RBAC system as well which allows you to assign different privilege levels to your operators, from view only to root-equiv access, as well as splitting up what devices you can perform actions on, as well as segmenting what actions you’re allowed to perform. Pretty powerful stuff once you understand how the pieces go together. 

Adding an Operator in the GUI

 This is a screen capture of the dialog used to add an operator into IMC.  It’s intuitive. You put the username in the username box, you put the password in the password box. Pretty easy right?

If you know what you’re doing and you’re a reasonably good typist, you can probably add an operator in a minute or less.

[Screenshot: the Add Operator dialog in IMC]

Where do Operators come from?

Don’t worry. This isn’t a birds and bees conversation.  One of the biggest mistakes that I see when people start into any network management system project, whether that’s Solarwinds, Cisco Prime, What’s up Gold, HP NNMi, or HP IMC, is that they don’t stop to think about what they want/need to do before they start the project.  They typically sit down, start an auto-discovery and then start cleaning up afterwards.  Not exactly the best way to ensure success in your project is it?

When I get involved in a deployment project, I try to make sure I do as much of the information gathering up front. This means I have a bunch of excel spreadsheets that I ask them to fill in before I even arrive onsite. This ensures two things:

  1. I can deliver on what the customer actually wants
  2.  I know when I’m done the project and get to walk away and submit the invoice. 

 

I won’t make any judgement call on which one of those is more important. 

 

 

My Operator Template

My operator template looks like this

[Image: the operator template spreadsheet]

The values map to the screen shot above exactly as you would expect them to. 

Full name is the full name. Name is the login name, password is the password etc…  

The authType is a little less intuitive, although it is documented in the API docs. The authType maps to the authentication type above which allows you to choose how this specific operator is going to authenticate, through local auth, LDAP, or RADIUS. 

The operator group, which is “1” in my example, maps to the admin operator group which means that I have root-level access on the NMS and can do anything I want. Which is, of course, how it should be, right?
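Since the image of the template didn’t survive, here is a rough sketch of what such a CSV could look like and how python reads it. The column names are taken from the JSON payload shown later in this post; the column order and the second operator row are made up for illustration.

```python
import csv
import io

# Hypothetical template contents. Column names match the JSON payload keys
# used later in the post; the second row is an invented operator.
template = io.StringIO(
    "fullName,name,password,authType,operatorGroupId,defaultAcl,sessionTimeout,desc\n"
    "Christopher Young,cyoung,access4chris,0,1,0,10,admin account\n"
    "Jane Example,jexample,changeme123,0,1,0,10,second admin account\n"
)

# DictReader uses the header row as keys and yields one dictionary per operator.
operators = list(csv.DictReader(template))
for operator in operators:
    print(operator["name"], operator["operatorGroupId"])
```

Every value comes back as a string, which happens to match the quoted values in the JSON payload the API accepts.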

 

The Problem

So I’ve got a CSV file and I know it takes about one minute to create an operator because I can type and I know the system. Why am I automating this? Well, there are a couple of reasons for that.

  • Because I can and I want to gain more python experience
  • Because if I have to add ten operators, this just became ten minutes.
  • Because I already have the CSV file from the customer. Why would I type all this stuff again?
  • Because I can reuse this same format at every customer project I get involved in. 
  • Because I can blame any typos on the customer

Given time, I could add to this list, but let’s just get to the code. 

The Code

Authenticating to the Restful API

Although the auth examples in the eAPI documentation use the standard urllib HTTP library, I’ve found that the requests library is MUCH more user friendly and easier to work with.

So I first create a couple of global variables called url and auth that I will use to store the server URL and the credentials.

 

#url header to prepend on all IMC eAPI calls
url = None

#auth handler for eAPI calls
auth = None 

Now we get to the meat. I think this is pretty obvious, but this function gathers the server details, username and password used to access the eAPI and then tests them out to make sure they’re valid (the 200 OK check). The values are stored in the url and auth global variables for use later on. I’m sure someone could argue that I shouldn’t be using global variables here, but it works for me. :)
 
import requests

def imc_creds():
    ''' This function prompts the user for IMC server information and credentials and stores
    the values in the url and auth global variables'''
    global url, auth
    imc_protocol = input("What protocol would you like to use to connect to the IMC server: \n Press 1 for HTTP: \n Press 2 for HTTPS:")
    if imc_protocol == "1":
        h_url = 'http://'
    else:
        h_url = 'https://'
    imc_server = input("What is the ip address of the IMC server?")
    imc_port = input("What is the port number of the IMC server?")
    imc_user = input("What is the username of the IMC eAPI user?")
    imc_pw = input('''What is the password of the IMC eAPI user?''')
    url = h_url+imc_server+":"+imc_port
    auth = requests.auth.HTTPDigestAuth(imc_user,imc_pw)
    test_url = '/imcrs'
    f_url = url+test_url
    try:
        #headers is the module-level dictionary defined alongside create_operator
        r = requests.get(f_url, auth=auth, headers=headers)
    except requests.exceptions.RequestException as e: #checks for requests exceptions
        print ("Error:\n"+str(e))
        print ("\n\nThe IMC server address is invalid. Please try again\n\n")
        return imc_creds() #start over; r was never set, so don't fall through
    if r.status_code != 200: #checks for valid IMC credentials
        print ("Error: \n Your credentials are invalid. Please try again\n\n")
        return imc_creds()
    print ("You've successfully accessed the IMC eAPI")
 
 
I’m using this function to gather the credentials of the operator accessing the API. By default when you first install HP IMC, these are admin/admin. You could ask: Why don’t you just hardcode those into the script? Why bother with writing a function for this?
Answer: Because I want to reuse this as much as possible and there are lots of things that you can do with the eAPI that you would NOT want just anyone doing. Plus, hardcoding the username and password of the NMS that controls your entire network is just a bad idea in my books.
 

Creating the Operators

I used the HP IMC eAPI /plat/operator POST call as the basis for this call.

[Screenshot: the /plat/operator POST call in the eAPI docs]

 

After doing a bit of testing, I arrived at a JSON object which would allow me to create an operator using the “Try it now” button in the API docs. (The online docs live at http://IMC_SERVER:PORTNUMBER/imcrs, BTW.)

{
    "password": "access4chris",
    "fullName": "Christopher Young",
    "defaultAcl": "0",
    "operatorGroupId": "1",
    "name": "cyoung",
    "authType": "0",
    "sessionTimeout": "10",
    "desc": "admin account"
}

Using the Try it now button, you can also see the exact URL that is used to call this API. 

The 201 response below means that it was successfully executed. ( you might want to read up on HTTP codes as it’s not quite THAT simple, but for our purposes, it will work ).

[Screenshot: the 201 response from the Try it now call]

Now that I’ve got a working JSON object and the URL I need, I’ve got all the pieces I need to put this small function together.

You can see the first thing I do is check to see if the auth and url variables are still set to None. If they are still None, I use the imc_creds function from above to gather them and store them.

 

I create another variable called headers which stores the headers for the HTTP call. By default, the HP IMC eAPI will respond with XML. After working with XML for a few months, I decided that I prefer JSON. It just seems easier for me to work with.

This piece of code takes the CSV file that we created above and decodes it into python dictionaries, one per row, using the column headers as the keys and each row’s cells as the values. This is really cool in that I can have ten rows, 50 rows, or 100 rows and it doesn’t matter. This script will handle any reasonable number you throw at it. (I’ve tested up to 20.)

 

import csv
import json
import requests

#headers forcing IMC to respond with JSON content. XML content return is the default
headers = {'Accept': 'application/json', 'Content-Type': 'application/json', 'Accept-encoding': 'application/json'}

def create_operator():
    if auth is None or url is None: #checks to see if the imc credentials are already available
        imc_creds()
    create_operator_url = '/imcrs/plat/operator'
    f_url = url+create_operator_url
    with open('imc_operator_list.csv', newline='') as csvfile: #opens imc_operator_list.csv file
        reader = csv.DictReader(csvfile) #decodes each row as a dictionary keyed by the column headers
        for operator in reader:
            payload = json.dumps(operator, indent=4) #serializes each row of the CSV as a JSON string
            r = requests.post(f_url, data=payload, auth=auth, headers=headers) #POSTs the payload to the operator URL
            if r.status_code == 409:
                print ("Operator Already Exists")
            elif r.status_code == 201:
                print ("Operator Successfully Created")
            else: #anything else is unexpected, so show the status code
                print ("Unexpected response: " + str(r.status_code))

 Now you run this code and you’ve suddenly got all the operators in the CSV file imported into your system. 

Doing some non-scientific testing, meaning I counted in Mississippi’s, it took me about 3 seconds to create 10 operators using this method.  

Time isn’t Money

Contrary to the old saying, time isn’t actually money. We can always get more money. There’s lots of ways to do that. Time on the other hand can never be regained. It’s a finite resource and I’d like to spend as much of it as I can on things that I enjoy.  Creating Operators in an NMS doesn’t qualify.

Now, I hand off a CSV file to the customer, make them fill out all the usernames and passwords and then just run the script. They have all the responsibility for the content and all I have to do is a visual check on the CSV file to make sure that they didn’t screw anything up.

 

Questions or comments or better ways to do this?  Feel free to post below. I’m always looking to learn.

 

@netmanchris