Machine Learning Resources

This page is a place to collect the resources I’ve found as I explore Machine Learning and its application, especially to networking analytics, infrastructure, and control/management plane feedback loops, while generally geeking out on technology and math. It’s pretty amazing stuff if you haven’t gotten into it.

 

This is an incomplete list and I’ll continue adding to it as I get time. Feel free to share links if you’ve got any you found useful! 

 

 Primer

YouTube video (low on math): youtu.be/b99UVkWzYTQ (thanks to Jon Hudson for this!)

 

Presentation

Dave Meyer’s presentation from the DevOps4Networking forum, March 2016

 

Training

Coursera Machine Learning Specialization using the R programming language from Johns Hopkins University

Coursera Machine Learning Specialization using the Python programming language from University of Washington

Khan Academy has been a great source to fill in some of the gaps around Calculus, Regression, Statistics, etc… 

Introduction to R and SWIRL

So I’m taking a Coursera course from Johns Hopkins on Data Science.

The course uses the R programming language, which is a derivative of the S programming language that came out of Bell Labs in the 1970s. I’m a huge believer in network programmability and SDN in general. From a traditional network management point of view, most of the work getting done and discussed today is really around the C (Configuration) in the FCAPS model. There are some people, like Jason Edelman, Matt Oswald, etc… who are using network programmability to automate troubleshooting tasks, but most of those are pretty straightforward:

  • automate information gathering
  • automate troubleshooting
  • identify the corrective action

Once you’ve got the corrective action nailed down, you could also automate the fix, but there are a lot of people who are still nervous about having changes happen without a human being involved. 
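The steps above, plus the optional automated fix, can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the device name, the hard-coded interface facts, and the function names gather_facts, troubleshoot and remediate are all made up for this example and do not come from any real library. The point is the approval hook, which lets a human stay in the loop before a change is applied.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Diagnosis:
    device: str
    problem: str
    corrective_action: Optional[str]  # None when no fix is known


def gather_facts(device: str) -> dict:
    """Step 1: collect state. Hard-coded here; a real version would query the device."""
    return {"device": device, "interface": "eth0",
            "admin_status": "up", "oper_status": "down"}


def troubleshoot(facts: dict) -> Diagnosis:
    """Steps 2 and 3: interpret the facts and identify a corrective action."""
    if facts["admin_status"] == "up" and facts["oper_status"] == "down":
        return Diagnosis(facts["device"],
                         f"{facts['interface']} is down",
                         f"bounce {facts['interface']}")
    return Diagnosis(facts["device"], "no fault found", None)


def remediate(diagnosis: Diagnosis, approve: Callable[[Diagnosis], bool]) -> str:
    """Optional step 4: apply the fix only when the approver says yes."""
    if diagnosis.corrective_action is None:
        return "nothing to do"
    if not approve(diagnosis):
        return f"skipped: {diagnosis.corrective_action}"
    # A real version would push the change here; this sketch only reports it.
    return f"applied: {diagnosis.corrective_action}"


if __name__ == "__main__":
    diagnosis = troubleshoot(gather_facts("edge-router-1"))
    # Swap the lambda for an interactive prompt to keep a human in the loop.
    print(remediate(diagnosis, approve=lambda d: False))
```

Passing the approval step in as a function means the same code can run fully automated in a lab and gated by a person in production.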

Automating configuration management and configuration-based fault detection and error correction are great things. But there are other parts of the network that can benefit from the application of a programming language to old problems.

I’m personally interested in the massive amounts of data that the network holds. We’ve got a ton of instrumentation within the network that is just sitting there to be accessed, tracked, and mined for useful insights.

Data Science is all about different methods for combing through all of that data in a scientifically reproducible manner and, hopefully, gaining some insights.
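As a very small taste of what mining that data could look like, here is a sketch in Python that flags samples sitting far from the average. Everything in it is an assumption for illustration: the utilization numbers are invented, flag_outliers is a made-up helper name, and a plain standard-deviation cutoff is just one simple technique among many, not a recommendation.

```python
import statistics


def flag_outliers(samples, threshold=2.0):
    """Return (index, value) pairs more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(samples)
    stdev = statistics.pstdev(samples)  # population standard deviation
    if stdev == 0:
        return []  # every sample is identical, so nothing stands out
    return [(i, x) for i, x in enumerate(samples)
            if abs(x - mean) / stdev > threshold]


# Invented five-minute utilization samples (percent) for a single interface.
utilization = [12, 14, 13, 15, 11, 13, 92, 14, 12, 13]

if __name__ == "__main__":
    print(flag_outliers(utilization))  # prints [(6, 92)]
```

Because the data and the threshold are both written down in the script, anyone can rerun it and get the same answer, which is the reproducible part.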

The Tools

Like Python, R has an IDE available that lets you run R code interactively or from R files. It can be downloaded at the CRAN site here.

There’s also a better IDE available called RStudio, which adds some functionality and is available here.

SWIRL is a library that gives learners access to interactive tutorials written in R, for R. There’s a Git repository here that provides a set of tutorials for different courses, so you can get a feel for the language syntax, creating functions, etc…

 

RStudio and SWIRL

Once you install the SWIRL library, which is really easy using the RStudio Install Packages feature, you load it (think import in Python) using the library(swirl) function. Once you’ve done that, you can either download the course files from the Git repository or install them directly from within R (which uses curl in the background to download the files directly into your working directory).

As you can see in the screen capture, I’ve got a few different courses installed, and each of the courses has a bunch of lessons inside it. The screen capture shows the lessons within the R Programming course. What’s also cool about this is that, assuming you’re enrolled in the Coursera R Programming course, you can complete the lesson, input your username and password (specific to your course, not your Coursera password) and, magically, you get extra credit for the course lessons you complete.

Extra credit is a good thing.

 

Wrap Up

I’ve only been into R for about a week. It’s got some nice features, but to be honest, I don’t have enough coding experience to really give a qualified opinion on the subject. I’ll continue to work with it and see where things go. There’s still a ton of Python that I need to learn, but I’ve already found a native Python library called rpy2 that allows me to access native R libraries from within my Python code. Best of both worlds I guess. 🙂
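For the curious, here is a minimal sketch of what calling into R through rpy2 can look like. It assumes both R and rpy2 are installed. The helper name r_median is made up for this example, and the fallback to None is only there so the snippet still runs on a machine without R. The importr call loads R’s stats package so its functions can be called like Python functions.

```python
def r_median(values):
    """Return the median as computed by R's stats::median, or None if R/rpy2 is unavailable."""
    try:
        import rpy2.robjects as ro
        from rpy2.robjects.packages import importr
    except Exception:  # rpy2 not installed, or R itself could not be found
        return None
    stats = importr("stats")             # load the R 'stats' package
    r_vector = ro.FloatVector(values)    # convert the Python list to an R numeric vector
    return float(stats.median(r_vector)[0])  # R returns a length-1 vector


if __name__ == "__main__":
    print(r_median([3.0, 1.0, 2.0]))
```

The conversion step matters: R functions expect R vectors, so the Python list is wrapped in a FloatVector before it is handed over.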