January 2017

Data Driven Revolutions

The write-up below isn't exactly the same as the one found in the Data Sketches book. For the book we've tightened up our wording, added more explanations, extra images, separated out lessons and more.


When we first agreed on Music for December, I felt quite lost; I didn't know what I wanted to do, other than perhaps something about K-pop. It wasn't until I was lamenting this lack of angle with Kenneth Ormandy that something he said stuck and I suddenly remembered DDR.

DDR was a huge part of my teenage life. I first saw it in 2001, at a friend of a friend's house. Since there were no arcades with DDR anywhere near where I grew up, I begged my parents to buy me a set. Easier said than done; there were absolutely no video games in our house at the time, so a DDR set meant not only the game itself, but the PS2 and the mat that came along with it. It took me two years to convince my parents, and the summer I got it, I was on it every day for hours. I was the type of kid that played the same song on the same difficulty over and over until I mastered it (with mastery defined as being able to clear the song at least twice in a row). I played it regularly until I left for college 5 years later.

(I should say though that I was never that great, and the highest I was ever able to clear was an 8/10 difficulty, and I most comfortably played songs that were 6 or 7/10.)

When I first went out looking, I was excited for the data out there; with hundreds of songs (thousands?), each with at least three difficulty levels and BPMs and maybe even the values in the Groove Radar (Stream, Voltage, Air, Freeze, and Chaos), that's potentially a lot of data. (By the way, never knew that pentagon was called a Groove Radar, I just learned it - thanks Google.) Never in my wildest dreams though, did I imagine all the steps from 645 songs just there, available. But then I came across the amazingness of DDR Freak and their step charts:

And I started talking to my Computer Vision friend about how I (he) would go about getting the data out of that image. But Kenneth suggested I just email DDR Freak and ask if they had the raw data. And I was skeptical (the site had been largely inactive for at least the last five years) but I found an email (Jason Ko, founder of DDR Freak) and was like, hey why not? Never expecting a response.

And then Jason responds 18 (read: 18!!) minutes later with a zip of all the songs he had on hand. How awesome is he??

His zip had several different data formats, including .dwi (Dance with Intensity) and .stp (their own proprietary format). The .dwi files looked like this:

The file format is explained here, but at the very basic: 0 indicates no arrow, 2 is Down, 4 Left, 6 Right, and 8 Up. Each character defaults to 1/8 of a beat, but (...) indicates a 1/16 step, [...] a 1/24 step, and so on.
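To make those rules concrete, here is a minimal Python sketch of how a single line of steps could be decoded. It is not the parsing code linked later in this write-up; the names `parse_steps`, `ARROWS`, and `BRACKETS` are made up for illustration, and it only handles the characters described above (anything else the full format defines raises an error).

```python
from fractions import Fraction

# Only the basic arrows described above; names here are illustrative.
ARROWS = {"2": "down", "4": "left", "6": "right", "8": "up"}
# Opening bracket -> (closing bracket, step length inside the bracket).
BRACKETS = {"(": (")", Fraction(1, 16)), "[": ("]", Fraction(1, 24))}
DEFAULT_STEP = Fraction(1, 8)


def parse_steps(line):
    """Return a list of (time, arrow) pairs for one line of step characters."""
    time = Fraction(0)
    step = DEFAULT_STEP
    closing = None  # closing bracket we are waiting for, if any
    steps = []
    for ch in line:
        if closing is None and ch in BRACKETS:
            closing, step = BRACKETS[ch]
        elif ch == closing:
            closing, step = None, DEFAULT_STEP
        elif ch == "0" or ch in ARROWS:
            if ch in ARROWS:
                steps.append((time, ARROWS[ch]))
            time += step  # every character, including 0, advances time
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return steps


print(parse_steps("2040(68)"))
```

Using `Fraction` instead of floats keeps the timings exact, which matters once 1/16 and 1/24 steps are mixed in the same song.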

With that information, I was able to quickly reverse-engineer the .stp files (which I found to be a bit more reader-friendly, and ended up using):

The only difference I found here is that instead of (...), {...} indicates 1/16, {{...}} indicates 1/24 (though I think in my code I assumed 1/32...), and so on.

Here is my (super straightforward) parsing code, and the resulting JSON file if you ever want to play with the data 😁✌️


Because I was traveling for the majority of December, I wanted to do something relatively simple. I had seen teamLAB's Crystal Universe a few weeks earlier, and was incredibly inspired by its beauty:

I wanted to do something similar for my December, where each step was a light, and I could animate it to light up based on its position in the song. It started out promisingly, where I mapped the steps for each song's 2 modes (Single and Double) and 3 difficulty levels (Basic, Trick, Maniac) over time:

To distinguish different modes/difficulties from the next row of music, I went back to the music sheet/staff idea from November (you might have to click and expand the image to see the lines since they're so faint):

But then I got really stuck. Yes, with the staff lines I can now distinguish rows of modes/difficulties from the next row of steps over time. But it didn't seem to do me much good, and the steps were so spread apart that I couldn't seem to see the whole song on one screen and see if there were any interesting patterns. I also couldn't figure out how the rest of the interface would fit in, where people would browse and select songs to bring up this song view.


While still stuck, I went to sleep with only two goals for an interface in mind: I wanted each song to be compact, and I wanted the visual analogy for a song to be continuous. I woke up the next day with spirals on my mind, and I was convinced that they were the answer to my problems; spirals, like circles, are very compact, and a spiral is basically one long continuous line that I could map my steps to.

I giddily set about looking into the math behind spirals, and was super happy to find that for an Archimedean spiral (a spiral where the distance between each spiral branch is the same) the radius was simply proportional to the angle:

Until I realized that...if I used that formula, I would end up with something like this:
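As a rough illustration of the problem (a sketch in Python, not the code used for this project), sampling an Archimedean spiral r = a * theta at equal angle increments gives points whose gaps keep growing as the spiral winds outward. The function names and the values of `a` and `angle_step` are arbitrary choices for the example.

```python
import math


def naive_spiral(n_points, angle_step=0.5, a=1.0):
    """Sample r = a * theta at equally spaced angles."""
    points = []
    for i in range(n_points):
        theta = i * angle_step
        r = a * theta
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def gaps(points):
    """Straight-line distance between each pair of consecutive points."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]


g = gaps(naive_spiral(50))
print("first gap:", g[0], "last gap:", g[-1])
```

The first gap is small and the last one is many times larger, so beats near the center would be crammed together while beats near the edge would be far apart.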

When instead I wanted the points (the beats) to be equidistant along the length of the spiral, and realized that the problem wasn't as simple as I thought:

(StackExchange: Equation to place points equidistantly on an Archimedian Spiral using arc-length)

And bashed my head because it's been about a decade since I last touched an integral. I spent an hour scribbling in my notebook trying to figure out all the steps in between the equations, so that I could understand how the final equation was derived. No luck.

Then two miraculous things happened:

  1. I posted my plight on Twitter, and Andrew Knauft came back with his own step-by-step derivation.
  2. I was waiting at a bar at the time, and someone had sat down in the empty seat across from me. I took the chance to ask how his math was (here's the full story), and after a few emails, he (Issac Kelly) had brute-forced the answer.

Because I was already far far behind on my month and completely out of time, I took Issac's brute-force approach and adapted it for the DDR steps:

(Adapted spiral code)
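For readers who want a feel for what a brute-force placement can look like, here is a minimal Python sketch. It is an assumption about the general idea only, not Issac's solution or the adapted code linked above: it walks along r = a * theta in tiny angle increments, adds up the length travelled (each tiny piece is roughly a * sqrt(1 + theta^2) * d_theta long), and drops a point every time that length reaches the desired spacing. The function name and all parameter values are hypothetical.

```python
import math


def equidistant_spiral(n_points, spacing=1.0, a=1.0, d_theta=1e-4):
    """Place points roughly `spacing` apart along the spiral r = a * theta."""
    points = [(0.0, 0.0)]  # first point sits at the center
    theta = 0.0
    travelled = 0.0  # length walked since the last point was placed
    while len(points) < n_points:
        # Length of a tiny piece of the spiral at the current angle.
        travelled += a * math.sqrt(1 + theta * theta) * d_theta
        theta += d_theta
        if travelled >= spacing:
            travelled -= spacing  # carry the small overshoot forward
            r = a * theta
            points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


pts = equidistant_spiral(50)
outer_gaps = [math.dist(p, q) for p, q in zip(pts[-11:], pts[-10:])]
print(min(outer_gaps), max(outer_gaps))
```

With one point per beat, a step at beat `i` could then be drawn at `pts[i]`. A smaller `d_theta` gives more even spacing at the cost of more loop iterations; the innermost gaps are slightly shorter in a straight line because the curve is tightest there.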

I then added legends for each song that also doubled as the filter (I originally had the legends up top that acted as a filter for ALL the songs, but found it completely unperformant, so switched to filters for EACH song):

Here is the final:

Overall, I'm happy with where I ended up given where I started. With the spiral, I can now compare multiple songs, so both the big spirals (long songs) and teeny spirals stand out. I can also see at a glance that some songs I used to love indeed had a lot of steps close together throughout the whole song with few breaks, so no wonder I was out of breath all the time. Having said that, I'm not 100% satisfied; for the sake of compactness, I sacrificed being able to see at a glance patterns within a song. One of the funnest things about DDR is that a song will have many sets of steps that repeat throughout; with this viz, it's hard to find those. I'm not sure if, given enough time, I would have been able to find a visualization that could show both trends between songs, as well as patterns within songs.

Either way, my favorite part about this month isn't at all the process or what I ended up with, but rather the kindness of (almost) perfect strangers willing to give a helping hand proving that (some) people are awesome ✌️

When Nadieh and I first agreed on the topic “Music”, I felt quite lost. I didn't know what I wanted to do, except to perhaps explore something related to K-Pop. But K-Pop was just too broad of a topic and I didn’t know it well enough (anymore) to find a good angle. <gutter note>I used to be really into K-Pop in the mid-2000’s.</gutter note> I was lamenting this lack of an angle with a friend when I suddenly remembered the game Dance Dance Revolution (DDR).

DDR originally started as an arcade game in Japan and eventually was released as video games for home consoles. Basically, the player would stand on a mat or a platform that served as the “controller”, and step on any of the four arrows on the mat to “press” the controller buttons. A combination of arrows would appear on the TV screen in front of them, and they would have to step on (or stomp on) the same arrow(s) on the mat, timing their steps with the arrows scrolling up the screen.  It was a popular game across the world and a huge part of my teenage life.

I first came across DDR in 2001 at a friend’s house. Since there were no arcades with DDR anywhere near where I lived, I begged my parents to buy me the game. (This was easier said than done; there were absolutely no video games in our house at the time, so a DDR set meant not only buying the game itself, but also the PS2 and the mat that came along with it.) It took me two years to convince my parents, and the summer I got it, I was on it every day for hours. I was the type that played the same song on the same difficulty over and over until I mastered it, and played it regularly until I left for college five years later.

Read the full write-up in our book "Data Sketches"