Hey Alexa, Turn my lab on!

TL;DR: I put together a custom Alexa Skill so I can turn switches and routers on in my lab, as shown in the video here. Feels pretty great.

 

As most of my Twitter followers have noticed, I’ve been doing a lot of home automation, mostly with Apple #homeKit. But I also picked up an Amazon Dot because… well, why not?

One of the great things about the digital voice assistant from Amazon is that they have created an extensible framework that enables those with a little bit of coding skill to add to Mrs. A’s already impressive array of abilities.

The Amazon Alexa developer page is pretty impressive. There’s a ton of information and tutorials there, as well as an SDK and code examples in Node.js. I’m almost exclusively a Python coder at this point, so I decided to look for something a little more familiar and came upon this.

Flask-Ask

Flask-Ask is a Flask extension that makes building Alexa skills for the Amazon Echo easier and much more fun.

Essentially, John Wheeler took the Flask WSGI (web) framework and made it super easy to create Amazon Alexa skills using this familiar library. I’ve used Flask in the past for a few projects, so this was a no-brainer for me.

John also put together a set of tutorials here which can be used to jumpstart the Alexa skills development process. There’s also a Flask-Ask Quickstart on the Amazon developer blog which pointed me towards ngrok, which came in really handy!

*Ngrok allows you to create secure tunnels to a local host. You run ngrok with the port number you want to expose and it automatically exposes the host as a resource on the ngrok website. It’s really, really, really cool.

The Project

Like many of us, I have a physical lab in my house from my CCIE studies. On top of that, specializing in network management over the years has required access to physical gear in a lot of instances. Powering that gear full time is out of the question because of the cost and power drain. As I’m sure you can imagine, going back and forth to turn things on and off gets old real quick.

To address that problem, I picked up a couple of intelligent PDUs on eBay. There are many “smart” PDUs out there and I happen to have a set of Server Technologies units that allow me to control each socket on a 16-port power bar. Pretty cool, right? No more walking to the garage, which is a good thing when you’re trying to focus on a problem.

So things are heading in the right direction; I can pop over to the local web interface of my PDU and turn my devices on and off. That’s nice… But all the home automation stuff I’ve been doing led me to wonder…

Why can’t I just ask for the device to be turned on?

I can ask Siri or Alexa to turn on the lights or adjust the temperature of my house. I can ask them about the weather or to check my calendar. There’s no reason why I shouldn’t be able to do the same with my lab gear.

So I decided to make that a reality.

What’s not covered in this blog

The one step which is not covered in this blog is writing the pyservertech library, which I built on top of the pysnmp library. Essentially I walked the MIBs until I found how to gather the info I needed and figured out which specific MIB object I needed to set to turn an individual power socket on or off. I might do a blog on that specific piece too, but for now, I’m trying to focus on the Alexa piece.

If there’s interest, please let me know in the comments or on Twitter and I’ll prioritize the SNMP set blog. 🙂

Building the Alexa Skill

Alexa skills are a combination of three components:

  • Flask-Ask – This is the actual code and includes the templates file shown below
  • Intent_Schema – Kinda obvious, but this includes the various intents that you’re going to use in your skill
  • Sample Utterances – Are a collection of the various verbal phrases and how they are connected to the intents.

I’ll do my best to connect these in the code below, but I’d really recommend going through a couple of the tutorials above and playing around with the examples to build some intuition on how these components connect.

The code below lets a user do the following using Amazon Alexa’s voice assistant.

  1. Ask Alexa to open the Lab skill ( Lab is what I called it )
  2. Alexa asks the user “Welcome to the lab. I’m going to ask you which plug you want me to turn on. Ready?”
  3. User responds with “Yes” or “Sure”
  4. Alexa asks the user “Please tell me which power socket you would like to turn on?”
  5. User responds with a number which is the power socket they would like to turn on
  6. Alexa decodes the response and returns the number in a JSON payload to the local Flask server
  7. Python code takes the number from the JSON payload and uses that as input into the power_on() function.
  8. The power_on() function sends an SNMP SET command for the appropriate socket.
  9. Device powers on. Alexa says “I’ve turned the power socket on.”
  10. I don’t walk to the garage.
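
Steps 6 through 9 can be sketched in plain Python. This is not the Flask-Ask code from this post, just a minimal illustration assuming the standard Alexa IntentRequest JSON layout. The intent name (AnswerIntent), the slot name (number), and the stub power_on() standing in for the pyservertech SNMP call are all hypothetical.

```python
import json


def power_on(socket_number):
    """Stub for the real SNMP SET call (hypothetical signature)."""
    return f"SNMP SET would switch on socket {socket_number}"


def extract_socket(request_body):
    """Pull the spoken number out of an Alexa IntentRequest payload."""
    payload = json.loads(request_body)
    slots = payload["request"]["intent"]["slots"]
    # Slot values arrive as strings, so convert before using them.
    return int(slots["number"]["value"])


def handle_request(request_body):
    """Turn the socket on and build the JSON reply Alexa will speak."""
    socket_number = extract_socket(request_body)
    power_on(socket_number)
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "PlainText",
                "text": "I've turned the power socket on.",
            },
            "shouldEndSession": True,
        },
    }


# A trimmed-down example of what Alexa might POST to the local server.
sample_request = json.dumps({
    "version": "1.0",
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "AnswerIntent",
            "slots": {"number": {"name": "number", "value": "5"}},
        },
    },
})

reply = handle_request(sample_request)
print(reply["response"]["outputSpeech"]["text"])
```

Flask-Ask hides most of this parsing behind decorators, which is a big part of why it is pleasant to use.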

Now that we understand how the code is supposed to work, let’s take a look at the individual pieces and how they fit together.

Alexa Skill

This is the Python code that you’ll run on your local machine. It contains only a portion of the logic of the “program”, as Amazon is really doing the majority of the lifting on their side: handling the speech recognition and returning the appropriate data in a JSON payload.

Templates file

This file contains the various phrases that Alexa is going to speak on behalf of your application. You can see we’ve only got a few different phrases.
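
In Flask-Ask these phrases live in a templates file. As a rough Python stand-in only, the idea looks something like the sketch below. The keys (welcome, ask_socket, done) are hypothetical names, and the phrases are the ones quoted in the walkthrough earlier in this post.

```python
# Hypothetical template keys mapped to the phrases Alexa speaks in this skill.
TEMPLATES = {
    "welcome": (
        "Welcome to the lab. I'm going to ask you which plug "
        "you want me to turn on. Ready?"
    ),
    "ask_socket": "Please tell me which power socket you would like to turn on?",
    "done": "I've turned the power socket on.",
}


def render(name):
    """Look up a phrase by key, failing loudly on a typo in the key name."""
    if name not in TEMPLATES:
        raise KeyError(f"no template named {name!r}")
    return TEMPLATES[name]


print(render("welcome"))
```

Keeping the phrases out of the handler code means you can tweak what Alexa says without touching the logic.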

Intent_Schema

This file gets loaded on the Amazon website.  Using the developer interface, you load the JSON which defines the Intent Schema directly into the intent schema location on the Interaction Model page.
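
For a feel of what that JSON looks like, here is a small sketch that builds one in Python and prints it. It assumes the older intent schema layout (a top-level "intents" list), and the intent and slot names are hypothetical, not necessarily the ones used in this skill.

```python
import json

# Hypothetical intent and slot names; AMAZON.NUMBER is a built-in numeric slot type.
intent_schema = {
    "intents": [
        {"intent": "YesIntent"},
        {
            "intent": "AnswerIntent",
            "slots": [{"name": "number", "type": "AMAZON.NUMBER"}],
        },
    ]
}

# The printed JSON text is what would be pasted into the Intent Schema box.
print(json.dumps(intent_schema, indent=2))
```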


Sample Utterances File

Just like the Intent Schema, the Sample Utterances file is also loaded directly into the Amazon developer portal, into the Interaction Model for this specific skill.
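
Each line of a sample utterances file starts with an intent name followed by a phrase that should trigger it. The sketch below generates such lines from a Python dictionary. The intent names and most of the phrases are hypothetical, with “yes” and “sure” taken from the walkthrough earlier in this post.

```python
# Hypothetical intent names, each mapped to phrases a user might say.
# {number} marks the slot Alexa fills with the spoken socket number.
utterances = {
    "YesIntent": ["yes", "sure"],
    "AnswerIntent": ["{number}", "socket {number}", "turn on socket {number}"],
}


def to_sample_utterances(mapping):
    """Flatten the mapping into 'IntentName phrase' lines, one per utterance."""
    return [
        f"{intent} {phrase}"
        for intent, phrases in mapping.items()
        for phrase in phrases
    ]


lines = to_sample_utterances(utterances)
print("\n".join(lines))
```

The more variations you list per intent, the more forgiving the skill is about how people actually phrase things.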


What’s next

This is just the start of this skill. All it does right now is turn things on, which is cool, but I want more. Just off the top of my head, here are some of the things I’d like to do:

  • Turn individual devices on or off
  • Turn individual devices on or off by name: “Alexa, ask the lab to turn on the HPE 2920 switch!”
  • Turn groups of devices on or off: “Alexa, ask the lab to turn on the Juniper branch!”
  • Request data from the PDUs: “Alexa, ask the lab how many devices are currently turned on” or “ask the lab how much power is currently being used”

As you can imagine, accomplishing all these goals would require a lot more code and logic. Definitely something I’m going to pursue, but I’m hoping that the simple example above helps to inspire someone else in their journey down this path.

Questions? Comments? You know what to do…

@netmanchris

Amazon S3 Outage: Another Opinion Piece

So Amazon S3 had some “issues” last week and it’s taken me a few days to put my thoughts together around this. Hopefully I’ve made the tail-end of the still interested-enough-to-find-this-blog-valuable period.

Trying to make the best of a bad situation, the good news, in my opinion, is that this shows that infrastructure people still have a place in the automated cloudy world of the future. At least that’s something, right?

What happened:

You can read the detailed explanation on Amazon’s summary here.

In a nutshell

  • there was a small problem
  • they tried to fix it
  • things went bad for a relatively short time
  • They fixed it

What happened during:

The internet lost its mind. Or more accurately, some parts of the internet went down. Some of them were extremely ironic.


Initial thoughts

The reaction to this event is amusing and it drives home the point that infrastructure engineers are as critical as ever, if not even more important considering the complete lack of architecture that seems to have gone into the majority of these “applications”.

First let’s talk about availability: Looking at the Amazon AWS S3 SLA, available here, it looks like they did fall below their 99.9% SLA for availability. If we take a quick look at https://uptime.is/ we can see that for the monthly period, they were aiming for no more than 43m 49.7s of outage. It seems like the outage lasted about 6-8 hours, so clearly they failed. Looking at the S3 SLA page, it looks like customers might be eligible for 25% service credits. I’ll let you guys work that out with AWS.
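
For anyone who wants to check the arithmetic, here is a quick sketch. It assumes an average month derived from a 365.2425-day year, a convention that reproduces the 43m 49.7s figure quoted above. The 6 and 8 hour outage lengths are the rough estimates from this post, not official numbers.

```python
# Average month length, assuming a 365.2425-day year split into 12 months.
SECONDS_PER_MONTH = 365.2425 / 12 * 24 * 60 * 60


def allowed_downtime_seconds(sla_percent):
    """Seconds of downtime per month that still meet the given SLA."""
    return SECONDS_PER_MONTH * (1 - sla_percent / 100)


def monthly_availability(outage_hours):
    """Availability percentage for a month containing one outage."""
    return 100 * (1 - outage_hours * 3600 / SECONDS_PER_MONTH)


budget = allowed_downtime_seconds(99.9)
minutes, seconds = divmod(budget, 60)
print(f"99.9% allows {int(minutes)}m {seconds:.1f}s of downtime per month")

for hours in (6, 8):
    print(f"a {hours}-hour outage leaves {monthly_availability(hours):.2f}% availability")
```

Either way you slice it, a multi-hour outage blows well past a budget measured in minutes.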

Don’t “JUST CLICK NEXT”

One of the first things that struck me as funny here was the fact that it was the US-EAST-1 Region which was affected. US-EAST is the default region for most of the AWS services. You have to intentionally select another region if you want your service to be hosted somewhere else. But because it’s easier to just click next, it seems that the majority of people just clicked past that part and didn’t think about where they were actually hosting their services or the implications of hosting everything in the same region and probably the same availability zone. For more on this topic, take a look here.

There’s been a lot of criticism of infrastructure people now that anyone with a credit card can go to Amazon, sign up for an AWS account and start consuming their infrastructure. This has been thrown around like it’s actually a good thing, right?

Well this is exactly what happens when “anyone” does that. You end up with all your eggs in one basket.  (m/n in round numbers)

“Design your infrastructure for the four S’s. Stability, Scalability, Security, and Stupidity” - Jeff Kabel

Again, this is not an issue with AWS, or any cloud provider’s offerings. This is an issue with people who think that infrastructure and architecture don’t matter and can just be “automated” away. Automation is important, but it’s there so that your infrastructure people can free up some time from mind-numbing tasks to help you properly architect the infra components your applications rely upon.

Why o Why o Why

Why anyone would architect their revenue-generating system on an infrastructure that was only guaranteed to 99.9% is beyond me. The right answer, at least from an infrastructure engineer’s point of view, is obvious, right?

You would use redundant architecture to raise the overall resilience of the application, relying on the fact that it’s highly unlikely that you’re going to lose the different redundant pieces at the same time. Put simply, what are the chances that two different systems, both guaranteed to a 99.9% SLA, are going to go down at the exact same time?

Well, doing some really basic probability calculations, and assuming the outages are independent events, we multiply the probability of system 1 being down (0.001, the 0.1% of time not covered by the SLA) by the same metric in system 2, and we get:

0.001 * 0.001 = 0.000001 probability of both systems going down at the same time.

Or another way of saying that is 99.9999% uptime. Pretty great, right?
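
Here is the same calculation as a small Python sketch, generalized to any number of redundant systems. It leans entirely on the independence assumption stated above, which real-world outages often violate, so treat the result as an optimistic upper bound.

```python
import math


def combined_availability(*availabilities):
    """Availability of redundant systems, assuming independent failures."""
    # The service is only down when every redundant system is down at once.
    p_all_down = math.prod(1 - a for a in availabilities)
    return 1 - p_all_down


single = 0.999  # a 99.9% SLA expressed as a fraction
both = combined_availability(single, single)

print(f"probability both are down: {1 - both:.6f}")
print(f"combined availability: {both * 100:.4f}%")
```

Passing a third 0.999 system to combined_availability() shows how quickly each extra independent copy adds nines.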

Note: I’m not an availability calculation expert, so if I’ve messed up a basic assumption here, someone please feel free to correct me. Always looking to learn!

So application people made the mistake of just signing over responsibility for their application uptime to “the cloud”. Most of them probably didn’t even read the SLA for the S3 service or sit down to think it through.

Really? We had people armed with an IDE and a credit card move our apps to “the cloud” and wonder why things failed.

What could they have done?

There’s a million ways to answer this I’m sure, but let’s just look at what was available within the AWS list of service offerings.

CloudFront is AWS’s content delivery system. It’s extremely easy to use and easy to set up, and it takes care of automatically moving your content to multiple AWS Regions and Availability Zones.

Route 53 is AWS’s DNS service that will allow you to perform health checks and only direct DNS queries to resources which are “healthy” or actively available.
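
The idea behind health-check-based routing can be shown with a toy model. To be clear, this is not the Route 53 API. The Endpoint class, the record names, and the regions below are hypothetical, and the health flag stands in for the result of a real health check.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str      # hypothetical record name
    region: str
    healthy: bool  # result of the most recent health check


def resolve(endpoints):
    """Return the first healthy endpoint, mimicking failover routing."""
    for endpoint in endpoints:
        if endpoint.healthy:
            return endpoint
    raise RuntimeError("no healthy endpoints left to answer with")


# Primary sits in the affected region; the secondary lives elsewhere.
records = [
    Endpoint("assets-primary", "us-east-1", healthy=False),
    Endpoint("assets-secondary", "us-west-2", healthy=True),
]

print(resolve(records).region)
```

The catch, of course, is that failover only helps if a second copy of the content actually exists somewhere else.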

There are probably a lot of other options as well, both within AWS and without, but my point is that the applications that went down most likely didn’t bother. Or they were denied the budget to properly architect resiliency into their system.

On the bright side, the latter just had a budget opening event.

Look who did it right

Unsurprisingly, there were companies who weathered the S3 storm like nothing happened. In fact, I was able to sit and binge-watch Netflix while the rest of the internet was melting down. Yes, it looks like it cost 25% more, but then again, I had no problems with season 4 of Big Bang Theory at all last week, so I’m a happy customer.

Companies still like happy customers, don’t they?

The Cloud is still a good thing

I’m hoping that no one reads this as an anti-cloud post. There’s enough anti-cloud rhetoric happening right now, which I suppose is inevitable considering last week’s highly visible outage, and I don’t want to add to that.

What I do want is for people who read this to spend a little bit of time thinking about their applications and the infrastructure that supports them. This type of thing happens in enterprise environments every day. Systems die. Hardware fails. Get over it and design your architecture to take these failures into consideration as a foregone conclusion. It IS going to happen, it’s just a matter of when. So shouldn’t we design up front around that?

Alternatively, we could also choose to take the risk for those services that don’t generate revenue for the business. If it’s not making you money, maybe you don’t want to pay for it to be resilient. That’s ok too. Just make an informed decision.

For the record, I’m a network engineer well versed in the arcane discipline of plumbing packets. Cloud and application architectures are pretty far away from the land of BGP peering and routing tables where I spend my days. But for the low low price of $15 and a bit of time on Udemy, I was able to dig into AWS and build some skills that let me look at last week’s outage with a much more informed perspective. To all my infrastructure engineer peeps: I highly encourage you to take the time, learn a bit, and get involved in these conversations at your companies. Hoping we can all raise the bar together.

Comments, questions?

@netmanchris