So I’m taking a Coursera course from Johns Hopkins on Data Science.
The course uses the R programming language, which is a derivative of the S programming language that came out of Bell Labs in the 1970s.
I’m a huge believer in network programmability and SDN in general. From a traditional network management point of view, most of the work getting done and discussed today is really around the C (configuration) in the FCAPS model. There are some people, like Jason Edelman, Matt Oswalt, etc., who are using network programmability for automating troubleshooting tasks, but most of those are pretty straightforward:
- Automate information gathering
- Automate troubleshooting
- Identify the corrective action
Once you’ve got the corrective action nailed down, you could also automate the fix, but there are a lot of people who are still nervous about having changes happen without a human being involved.
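The three steps above, plus the optional automated fix, can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the interface names, counters, threshold, and suggested actions are all made up, and `gather_info` returns hard-coded data where a real script would poll devices.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    interface: str
    problem: str
    corrective_action: str


def gather_info():
    """Step 1: gather information. Hypothetical, hard-coded interface state."""
    return {
        "Gi0/1": {"oper_status": "up", "crc_errors": 0},
        "Gi0/2": {"oper_status": "up", "crc_errors": 1542},
        "Gi0/3": {"oper_status": "down", "crc_errors": 0},
    }


def troubleshoot(interfaces, crc_threshold=100):
    """Steps 2 and 3: flag problems and pair each with a suggested action."""
    findings = []
    for name, state in sorted(interfaces.items()):
        if state["oper_status"] != "up":
            findings.append(
                Finding(name, "interface down", "check cabling and the far-end port")
            )
        elif state["crc_errors"] > crc_threshold:
            findings.append(
                Finding(name, "high CRC errors", "replace the cable or optic")
            )
    return findings


def apply_fixes(findings, approved=False):
    """Optional step: the human gate. This sketch only labels each action
    as proposed or applied; it never touches a device."""
    label = "APPLIED" if approved else "PROPOSED"
    return [f"{label} {f.interface}: {f.corrective_action}" for f in findings]


if __name__ == "__main__":
    for line in apply_fixes(troubleshoot(gather_info())):
        print(line)
```

Keeping `approved=False` as the default means nothing is marked as applied unless a person explicitly signs off, which is one way to ease into automated fixes.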
Automating configuration management, configuration-based fault detection, and error correction are all great things. But there are other parts of the network that can benefit from applying a programming language to old problems.
I’m personally interested in the massive amounts of data that the network holds. We’ve got a ton of instrumentation within the network that is just sitting there, waiting to be accessed, tracked, and mined for useful insights.
Data Science is all about different methods for sifting through all that data in a scientifically reproducible manner and, hopefully, gaining some insights.
Like Python, R has an IDE available that will allow you to run R code interactively or through R files. It can be downloaded from the CRAN site.
There’s also a better IDE available called RStudio, a separate download that adds some extra functionality.
swirl is a library which allows learners to access interactive tutorials written in R, for R. There’s a Git repository which provides a set of tutorials for different courses, so you can get a feel for the language syntax, creating functions, etc.
Once you install the swirl library, which is really easy using the RStudio Install Packages feature, you load it (think import in Python) by calling library(swirl). Once you’ve done that, you can either download the course files from the Git repository, or you can install them directly from within R (which uses curl in the background to download the files directly into your working directory).
As you can see in the screen capture, I’ve got a few different courses installed, and each of the courses has a bunch of lessons inside it. The screen capture shows the lessons within the R Programming course. What’s also cool about this is that, assuming you’re enrolled in the Coursera R Programming course, you can complete the lesson, input your username and password (specific to your course, not your Coursera password) and, magically, you get extra credit for the course lessons you complete.
Extra credit is a good thing.
I’ve only been into R for about a week. It’s got some nice features, but to be honest, I don’t have enough coding experience to really give a qualified opinion on the subject. I’ll continue to work with it and see where things go. There’s still a ton of Python that I need to learn, but I’ve already found a Python library called rpy2 that allows me to access native R libraries from within my Python code. Best of both worlds I guess. 🙂
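For a feel of what that looks like, here is a minimal sketch that calls R's `mean()` from Python through rpy2. The `mean_latency` helper and the sample numbers are hypothetical, and the guarded import is my own assumption so the script still runs on a machine without rpy2 or R by falling back to Python's `statistics` module.

```python
import statistics

try:
    # rpy2 needs both the Python package and a working R installation.
    import rpy2.robjects as robjects
    HAVE_R = True
except Exception:  # rpy2 missing, or R itself could not be located
    HAVE_R = False


def mean_latency(samples):
    """Average a list of numbers, using R's mean() when R is reachable."""
    if HAVE_R:
        r_mean = robjects.r["mean"]  # look up R's built-in mean function
        result = r_mean(robjects.FloatVector(samples))
        return float(result[0])  # R hands back a length-1 numeric vector
    return statistics.mean(samples)  # pure-Python fallback


if __name__ == "__main__":
    print(mean_latency([10.0, 20.0, 30.0]))
```

The same pattern (look a function up through `robjects.r`, wrap the Python list in an R vector type, unwrap the result) should carry over to other R functions.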