August 25, 2011

Top-20 Worst Passwords

In December 2009 a social media company experienced a data breach that exposed details of 32 million user accounts. The company, RockYou, had stored users' passwords in plain text!

The stolen passwords were published (with no other identifying information) on a blog. Security company Imperva used them in a white paper analyzing the strength of the passwords. The white paper lists the top-20 weakest passwords and the frequency with which each was used.

I've been planning to use IBM's social visualization Web site Many Eyes for some time, and the RockYou passwords data is my first foray. I extracted the passwords from the Imperva white paper and created this data set on Many Eyes.

I then created two visualizations using the data set. The first is a simple histogram:

and the second is a word cloud:

What insights do the visualizations offer?
  • Firstly, it's obvious that simple numeric sequences dominate
  • As seen elsewhere, password and iloveyou were popular with users
  • Forenames of celebrities also appear to be popular: michael (jackson?), jessica (simpson?), nicole (richie?), daniel (radcliffe?), ashley (tisdale?)
  • The Web site's own name, rockyou, was commonly used
  • Passwords as short as five characters were permitted (and apparently non-alphanumeric characters were not allowed!)
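
Observations like the first and last ones can also be checked mechanically rather than by eye. Here is a minimal Python sketch, assuming the passwords have been loaded into a plain list. The sample values are illustrative stand-ins, not the Many Eyes data set or its counts, and the function names classify and summarize are hypothetical.

```python
from collections import Counter

# Illustrative stand-in values only; NOT the actual top-20 list or its counts.
SAMPLE = ["12345", "123456", "password", "iloveyou", "rockyou", "michael"]


def classify(password):
    """Label a password by the kind of characters it contains."""
    if password.isdigit():
        return "digits only"
    if password.isalpha():
        return "letters only"
    if password.isalnum():
        return "letters and digits"
    return "contains other characters"


def summarize(passwords):
    """Return (count of passwords per category, shortest password length)."""
    kinds = Counter(classify(p) for p in passwords)
    shortest = min(len(p) for p in passwords)
    return kinds, shortest


kinds, shortest = summarize(SAMPLE)
for kind, n in kinds.most_common():
    print(f"{kind}: {n}")
print("shortest length:", shortest)
```

Run against the real list, the "digits only" tally would put a number on how far numeric sequences dominate, and an empty "contains other characters" category would be consistent with the guess that such characters were not allowed.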

These simple visualizations are useful for providing quick insights into what is fairly simple data.

I found Many Eyes easy to use and am looking forward to seeing what else is possible with it.