February 22, 2013

Coursera Data Analysis MOOC: Half-Time Entertainment

It's week five of the Coursera Data Analysis MOOC, and it's been a busy time since the course commenced (see my first impressions). I've just completed the weekly quiz, so I have time to come up for air and post a progress report.

The course is similar in many respects to my time at uni: I've been late to lectures, and had to scramble to complete tests and assignments. As I mentioned previously, one of my main motivations was to learn about MOOCs. So, I wasn't too worried when by the middle of week one I hadn't been spoon-fed course material. I'd expected a flurry of emails with course information, but my inbox was quiet. In fact, I needed to visit the Coursera Data Analysis Web site to attend class. By the time I did, I found the course well under way, and I had a lot of catching up to do.

I also realised that, just like uni, turning up wasn't going to be enough; I needed to invest serious time in understanding what was being taught and applying it to tests and assignments. So, I pulled my finger out and put aside some other projects to clear time each week to devote to the course.

Having a weekly quiz with a hard deadline has been a useful motivator. It would have been easy to chuck it in - after all, enrolment is free - or let things slip until I had more time. With the quiz deadline I have a weekly goal that keeps me working on the course each day.

I've just completed the first assignment. It was an interesting project focussed on a data set from the Lending Club, a peer-to-peer loans service. We were given two weeks to submit our work. Following this, we had a week to mark at least four of our peers' assignments (failing to do so incurs a 20% penalty on your own assignment). We were provided with a simple assessment template to guide us through marking.

This is the first time Coursera has presented the Data Analysis course, and there have been a few hiccups along the way. Lecture notes included a few typos, scheduling of deadlines needed to be fine-tuned, and the requirements of the assignments were changed due to security issues (running a stranger's R code is inherently risky).

Many of the changes have come about from feedback via the course forum. I've not had much time to participate in the forum other than occasionally scanning the top-voted posts.

I've found the course material challenging and rewarding. It's clear that data analysis requires a strong grounding in statistics. Prof. Leek has provided us with a tool kit for data analysis: techniques and how to apply them using R. However, the underlying mathematics is not covered (the course is only eight weeks long). Prof. Leek has provided links to further resources that supply this background, but I haven't had time to delve into this material.

That being said, I am becoming more proficient with R, which is useful in my day-to-day work. And I have gained a better understanding of the techniques available to me for data analysis work.

I'll post another update at the end of the course in March.