Data Sketches - Cardcaptor Sakura

I am starting to understand that the topics that I can get really enthusiastic about are a bit niche, and that not many other people truly know them. I can only hope that for the people who are fans, as I am, this project will be a joy to explore. The topic that I've chosen for Fearless is "Cardcaptor Sakura"!

It's a magical-girl manga (i.e. a Japanese comic) from about 20 years ago. It was the first manga I owned, when manga was practically unknown in the Netherlands (still is actually (╥﹏╥) ). I even had to travel all the way to the biggest city of our province to buy a new volume. I'm still dreadfully jealous of how perfect and cute practically each panel of the manga looks. I almost chose this subject for our Nostalgia topic, but eventually went with Dragonball Z then. Nevertheless, after 20 years a new "arc" of Cardcaptor Sakura has recently started. Therefore, while thinking of what to do for our Fearless topic, I just couldn't shake the feeling of wanting to do something with Cardcaptor Sakura (CCS), which revolves around Sakura, a brave, fearless, and kind magical girl.

One of my favorite things about CCS is how beautiful the writers, CLAMP, make each page. The covers of each chapter especially are tiny works of art (the image above is the cover for chapter 23). I therefore wanted to investigate the covers through data somehow. And I'd never before done any kind of analysis based on image data. I therefore thought that creating a visualization that would abstract the colors of each cover into 3 - 8 colors would be fun and new for me.

Data

There are 50 chapters in the original CCS manga, divided into two "arcs". I went through all 12 CCS volumes to see what image was on the cover of each chapter. In my volumes the covers are printed in black and white. However, all of those chapter covers have since been published in full-color in several CCS art books. I therefore searched for and downloaded the corresponding color image from the CCS Wiki page.

All 12 manga volumes of Cardcaptor Sakura

Using the imager package in R, I loaded each image into R as a multidimensional array of RGBA pixel values. I converted that complex array into a simpler data frame of (number of pixels) * 3 (for the r, g, and b values) size. To figure out which algorithm would cluster the pixel values into decent color groups I tried several things. First I experimented with different clustering techniques: from the standard K-means, to hierarchical clustering and even t-SNE. But I also converted the RGB values of each pixel into other color spaces (where colors have different "distances" to each other and can thus lead to different clustering results), using, amongst others, the colorspace package.
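To make the "flatten the pixels, then switch color space" step a bit more concrete, here is a minimal sketch in plain JavaScript. It is not the R code used for this project (that relied on imager and colorspace); the function names rgbaToRows and rgbToLab are made up for illustration, and the conversion assumes sRGB input with a D65 white point.

```javascript
// Flatten an RGBA byte buffer (r,g,b,a,r,g,b,a,...) into rows of [r, g, b].
// This mirrors the "(number of pixels) * 3" table described above.
function rgbaToRows(rgba) {
  const rows = [];
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    rows.push([rgba[i], rgba[i + 1], rgba[i + 2]]); // alpha is dropped
  }
  return rows;
}

// Convert one sRGB color (0-255 per channel) to CIE Lab.
// Assumption: sRGB primaries and a D65 reference white.
function rgbToLab([r, g, b]) {
  // 1. Undo the sRGB gamma curve to get linear values in 0..1
  const lin = [r, g, b].map((v) => {
    const c = v / 255;
    return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  // 2. Linear RGB -> XYZ
  const x = 0.4124564 * lin[0] + 0.3575761 * lin[1] + 0.1804375 * lin[2];
  const y = 0.2126729 * lin[0] + 0.7151522 * lin[1] + 0.0721750 * lin[2];
  const z = 0.0193339 * lin[0] + 0.1191920 * lin[1] + 0.9503041 * lin[2];
  // 3. XYZ -> Lab, relative to the D65 white point
  const f = (t) =>
    t > 216 / 24389 ? Math.cbrt(t) : ((24389 / 27) * t + 16) / 116;
  const fx = f(x / 0.95047);
  const fy = f(y / 1.0);
  const fz = f(z / 1.08883);
  return [116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)];
}

// Example: two pixels (pure white and pure black)
const examplePixels = rgbaToRows([255, 255, 255, 255, 0, 0, 0, 255]);
const exampleLab = examplePixels.map(rgbToLab);
console.log(exampleLab);
```

The point of the detour through Lab is that Euclidean distances between Lab values line up better with how different two colors look, which is what a distance-based clustering algorithm cares about.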

I often converted the results of each test into a bar chart such as below to see the color groups found. Eventually, I found that using K-means together with the colors converted to "Lab" visually gave the best fitting results.

Color distribution of the first CCS chapter using K-means
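For readers who haven't met K-means before, the idea fits in a few lines. The sketch below is a bare-bones JavaScript version, not the R function used for the project. The name kmeans, the fixed number of iterations and the deterministic choice of starting centers are all simplifying assumptions, and the six input points are made-up Lab-like values (three reddish, three bluish).

```javascript
// Squared Euclidean distance between two points of equal length
function sqDist(a, b) {
  return a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0);
}

// Minimal K-means: alternate between assigning points to the nearest
// center and moving each center to the mean of its points.
function kmeans(points, k, iterations = 20) {
  // Assumption: start from k points spread evenly through the input
  // (real implementations usually pick random or k-means++ starts)
  let centers = Array.from({ length: k }, (_, i) =>
    points[Math.floor((i * points.length) / k)].slice()
  );
  let labels = new Array(points.length).fill(0);

  for (let it = 0; it < iterations; it++) {
    // Assignment step: nearest center for every point
    labels = points.map((p) => {
      let best = 0;
      let bestDist = Infinity;
      centers.forEach((c, j) => {
        const d = sqDist(p, c);
        if (d < bestDist) {
          bestDist = d;
          best = j;
        }
      });
      return best;
    });
    // Update step: move each center to the mean of its members
    centers = centers.map((c, j) => {
      const members = points.filter((_, i) => labels[i] === j);
      if (members.length === 0) return c; // keep an empty cluster in place
      return c.map(
        (_, dim) => members.reduce((s, m) => s + m[dim], 0) / members.length
      );
    });
  }
  return { centers, labels };
}

// Made-up Lab-like pixel values: three reddish, three bluish
const labPoints = [
  [53, 80, 67], [54, 79, 66], [52, 81, 68],
  [32, 79, -108], [33, 78, -107], [31, 80, -109],
];
const clustering = kmeans(labPoints, 2);
console.log(clustering.labels, clustering.centers);
```

Each resulting center is the "average color" of a group, and the number of pixels per label gives the height of the corresponding bar.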

However, one of the tricky things with K-means is to figure out how many clusters should be used to create groups. I first tried a combination with hierarchical clustering, but eventually I decided to use something that was probably a better judge (but more time consuming), my own eyes! For each chapter I created a graph as below, that shows me the color distribution for 3 color clusters, up to 11 (I didn't want too many). I then compared the actual cover to these groups and chose the best fitting one; a balance between capturing all the colors and having a good blend of distinct colors. I saved the hex colors and percentages (the height of the bars) of the best clustering into a JSON file.

Color distributions of several K-means results with different numbers of clusters
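As an illustration of what "hex colors and percentages" could look like once written out, here is a small JavaScript sketch. The field names color and percentage, and the functions toHex and summarizeClusters, are hypothetical; the actual file layout used in the project isn't described here.

```javascript
// Turn an [r, g, b] triple (0-255) into a hex string such as "#ff0000"
function toHex([r, g, b]) {
  return (
    "#" +
    [r, g, b]
      .map((v) =>
        Math.max(0, Math.min(255, Math.round(v)))
          .toString(16)
          .padStart(2, "0")
      )
      .join("")
  );
}

// Given the cluster centers (as RGB) and the cluster label of every pixel,
// return one { color, percentage } entry per cluster, biggest first.
function summarizeClusters(centersRgb, labels) {
  const counts = new Array(centersRgb.length).fill(0);
  labels.forEach((label) => {
    counts[label] += 1;
  });
  return centersRgb
    .map((center, i) => ({
      color: toHex(center),
      percentage: (100 * counts[i]) / labels.length,
    }))
    .sort((a, b) => b.percentage - a.percentage);
}

// Toy example: four pixels, three in the red cluster and one in the blue one
const summary = summarizeClusters(
  [[255, 0, 0], [0, 0, 255]],
  [0, 0, 0, 1]
);
console.log(JSON.stringify(summary));
// → [{"color":"#ff0000","percentage":75},{"color":"#0000ff","percentage":25}]
```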

To complement this data about the chapter covers, I also wanted to gather information about which characters appeared in each chapter and which "card" was captured in which chapter (CCS is about Sakura collecting so-called magical Clow cards). The CCS Wiki page on each chapter seemed like just the resource I needed. But sadly, only the first 8 chapters contain information. Well, nothing else to do but to read all chapters again myself while slowly filling an Excel file with the info I needed ¯\_(ツ)_/¯

Due to the "layered" visual result of this project, I eventually sliced and diced all of this information (who is in each chapter/cover image, totals per character/chapter, relations between characters, color distributions, and more) into about 7 separate small files throughout the creation process. I prefer to prepare all data beforehand in R. I find that the easiest and as a bonus, it doesn't clutter my JavaScript code.

Sketch

Figuring out the design for this project came slowly. It was more a domino effect. A more concrete idea for one aspect led to a vague idea for the next part of the data which I then explored. I started with how to visualize the colors. Having a cluster of small colored circles per chapter seemed like a logical/interesting step. Placing the color clusters in a radial layout was also an obvious choice after that. Although at first I wanted to do a semi circle, with the color clusters to the right and character info to the left. However, with 50 chapters, I really needed as much space as I could get. So that's why I went with the layered approach of a small circle for all characters and around that another circle with all chapter color clusters. These two circles would then be connected by lines to show which characters appeared in which chapters.

I've always been fascinated by the CMYK dot printing process; where you can see the separate dots when you're looking at it up close, but move farther back and the bigger picture comes into view (I'm guessing I'm not the only one who could sometimes be found with her nose literally touching an old magazine, or old TV (for the RGB stripes), right....?). Recreating this CMYK dot technique for a visualization about a (printed) manga seemed like a proper style, and challenge. And challenge it was! I won't go into the details here (you can read a bit more in the code section), but below on the right page you can see some scribbles I made to understand how to recreate the CMYK effect (it has to do with rotations...).

Sketch of trigonometric functions to figure out CMYK dots
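To show roughly where those rotations come in, here is a small JavaScript sketch of a CMYK halftone for one flat-colored circle. It is only meant to illustrate the principle and is not the code from the final piece. The screen angles (15°, 75°, 0° and 45° for C, M, Y and K) are the conventional values from print, the naive RGB to CMYK formula ignores color profiles, and sizing the dots with the square root of the ink amount is my own simplification. All function names are made up.

```javascript
// Naive RGB (0-255) to CMYK (0-1) conversion, without color profiles
function rgbToCmyk([r, g, b]) {
  const rn = r / 255, gn = g / 255, bn = b / 255;
  const k = 1 - Math.max(rn, gn, bn);
  if (k === 1) return { c: 0, m: 0, y: 0, k: 1 }; // pure black
  return {
    c: (1 - rn - k) / (1 - k),
    m: (1 - gn - k) / (1 - k),
    y: (1 - bn - k) / (1 - k),
    k,
  };
}

// Centers of a square grid with the given spacing, rotated by angleDeg
// around the origin, keeping only the centers inside the circle.
function rotatedGrid(radius, spacing, angleDeg) {
  const a = (angleDeg * Math.PI) / 180;
  const cos = Math.cos(a), sin = Math.sin(a);
  const n = Math.ceil(radius / spacing);
  const dots = [];
  for (let i = -n; i <= n; i++) {
    for (let j = -n; j <= n; j++) {
      // Rotate the unrotated grid position (i, j) * spacing
      const x = (i * cos - j * sin) * spacing;
      const y = (i * sin + j * cos) * spacing;
      if (Math.hypot(x, y) <= radius) dots.push({ x, y });
    }
  }
  return dots;
}

// Conventional print screen angles per ink (an assumption for this sketch)
const SCREEN_ANGLES = { c: 15, m: 75, y: 0, k: 45 };

// One list of dots (position, radius, ink channel) for a circle of one color
function halftoneDots(rgb, radius, spacing) {
  const ink = rgbToCmyk(rgb);
  const dots = [];
  for (const channel of ["c", "m", "y", "k"]) {
    if (ink[channel] === 0) continue; // no ink, no dots
    // Dot area grows with the ink amount, so the radius grows with its root
    const r = (spacing / 2) * Math.sqrt(ink[channel]);
    for (const p of rotatedGrid(radius, spacing, SCREEN_ANGLES[channel])) {
      dots.push({ x: p.x, y: p.y, r, channel });
    }
  }
  return dots;
}

// A pure red circle only needs magenta and yellow dots
const redDots = halftoneDots([255, 0, 0], 10, 4);
console.log(redDots.length, new Set(redDots.map((d) => d.channel)));
```

Because every ink gets its own grid angle, the four dot layers never line up exactly, which is what gives the familiar rosette look up close.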

Another mathematical challenge for this project was for something that I didn't use in the end... At first I wanted to connect the inner circle of characters with the outer ring of chapters with swirling lines (to say it technically, two connected SVG Cubic Bézier Curves). Making sure that these lines would always flow around the inner circle and look good, took a lot of time and notes in my sketchbook! I always prefer to draw the approximate SVG path shape I have in mind to then try and figure out where new points and anchor points should be placed. The really hard part is to then understand how these points and anchor points change when the data changes; how to create a "formula" that works for all instances.

Sketch of the swirling lines between characters and chapters

The page below shows specifically how to handle the calculation of a tangent line to a circle for different circumstances. I needed this information for those swirly lines from above. But even though I ended up with different lines, I could thankfully use part of the things I'd figured out on these two pages to easily convert the lines to what became the final result (the more circular running lines).
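The basic version of that tangent calculation can be sketched in a few lines of JavaScript. This covers only the textbook case of a point outside a circle, not the different circumstances worked out in my sketchbook, and the function name tangentPoints is made up for this example.

```javascript
// Find the two points where lines through p touch the circle
// with the given center and radius r (p must lie outside the circle).
function tangentPoints(center, r, p) {
  const dx = p.x - center.x;
  const dy = p.y - center.y;
  const d = Math.hypot(dx, dy);
  if (d < r) return []; // p is inside the circle: no tangent lines
  const base = Math.atan2(dy, dx); // direction from the center to p
  const alpha = Math.acos(r / d); // angle between that direction and a tangent point
  return [base + alpha, base - alpha].map((t) => ({
    x: center.x + r * Math.cos(t),
    y: center.y + r * Math.sin(t),
  }));
}

// Unit circle at the origin, seen from the point (2, 0)
console.log(tangentPoints({ x: 0, y: 0 }, 1, { x: 2, y: 0 }));
```

At each returned point the radius of the circle is perpendicular to the line towards p, which is exactly the property that makes a curve "flow around" the inner circle without cutting into it.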

Code

I first focused on getting the ring of cover color circles on my screen. Mostly because I wanted to see how that CMYK idea would look as soon as possible. And thanks to this excellent example of multiple points of gravity by Shirley, that was actually a piece of cake! But damn, those circles had to become quite small to make room for all 50 chapters. I was starting to get my doubts if the CMYK effect would work as well in this particular design as I'd hoped...

All the main colors of the 50 chapters clustered

Nevertheless, I first went to one of Veltman's amazing blocks in which he already neatly coded up a CMYK dot effect as SVG patterns (btw, I've started collecting my favorite d3.js blocks in a Pinterest board, so I have a visual 'bookmark' for each, which makes for easier retrieval). I rewrote that to create a separate pattern for each color and I had myself a ring full of CMYK based colored circles. But on closer inspection I found something I didn't quite like. Although the circles on the inside looked exactly how I wanted them, they were still SVGs. So they had been perfectly clipped into a circle. Truly like a pattern that you cut off. But I wanted my CMYK dots to smoothly fade out, not abruptly. I also wanted to play with the idea of partially overlapping the circles, and having the colors mix even further, which wasn't possible with this technique.

I therefore did a wide search online looking for other examples. I already expected that using HTML5 Canvas was probably the way to go. And I did find two interesting options here and here that took me a good 3 - 4 hours to wrap my head around and combine into one. It took a lot of testing and tweaking...

But eventually I got the visual options and look I was going for. First, smooth edges. By which I mean that the dots get smaller around the sides, but all the CMYK dots are still full dots. But also that I could plot the circles on top of each other with the CMYK effects of both circles visible.

Final canvas CMYK circles that have smooth edges and can overlap

And then I applied it to all the circles from the 50 chapters and..... as expected the circles were so small that there wasn't "enough CMYK" going on. It was sometimes a bit hard to actually get a feeling for the true color of a circle because it contained only a few CMYK dots in itself (ー_ー﹡; )

Well, that's how (dataviz) design sometimes works, hours of work on something that never makes it to the end. So I converted a simple version into a block to perhaps use for another time and took a closer look at the original SVG version again. What to do about those crisp outer edges? ... Hmmm ... What about adding a thick stroke? And yup, that fixed it enough for me, haha (*^▽^*)ゞ

I thought the lines between the inner and outer circles would probably be the next most difficult thing to tackle, but to properly do that I first needed my inner circle. Whipping up the thin donut chart from my sketch was straightforward with d3's arc and pie layout. Still, I felt I first wanted to see if I could get the connections/relations between the characters "visually working" on the inside before I moved to the outer lines. Because if it didn't work I might have to think of a different general layout.

Time to write some custom SVG paths again! Below you can see the progress from the simplest approach, straight lines in the top left, to the final version (shape wise), in the bottom right. The final version is made up of circles, using the SVG arc command, re-using code that I had written for the small arcs in my project about Fantasy books.

I then colored the lines according to the type of connection (e.g. family, love) which made me see that there weren't too many lines in there to get insights from, no visual overload, pfew. Alright, then it was really time to dive into those outer lines...

The most extreme lines that I could create would run from a character to a chapter that's on the other side of the circle. The line would then have to swirl around the inner circle, without touching any of the other characters' names. I thought I could probably pull that off by combining 2 Cubic Bézier curves. But making 1 of those curves act as you want, depending on the data, can be a hassle. And I found out that 2 was more than twice the hassle o_O

With difficult SVG paths I always start out with placing small circles along the line path itself (the red one in the center below) + the anchor points (the blue, green and yellow-orange one. The pink one I placed for another reason that's too technical for me to explain here, hehe)

Placing the SVG path anchor points to understand the line movement

After some manual tweaking with fixed numbers I had a shape for the longer line that I liked. I saved those settings and did another one, a short line. I then inspected how all of the settings changed between the two options. This gives me hints on how to infer several formulas that will hopefully create nice looking lines, no matter the start and end points. But, like I said, that wasn't as easy this time as I'd hoped...

Understanding how the SVG path anchor points move for a shorter line

Ugh, I don't even want to really think back on what journey eventually led me to have the lines I needed. Most of the notes in my sketches section are about this part. Because slightly different things need to happen when the line moves counterclockwise instead of clockwise. And if you mess it all up completely, well...

Over many, many hours of testing, drawing, thinking and fiddling I inched closer to having all the lines at least sort of going around the center. Although here the finer details of the lines were still quite.... odd...

Weird looking, but generally correct lines

I didn't make a note of how long this particular section of "creating the lines" took, but my guess is somewhere between 8 - 10 hours. After which I was left with the following when I visualized all of the lines (they represent the chapters that each character appears in):

Awesome, that looked like one big mess! There were too many lines in there to glean any insight. That would mean I'd have to create some sort of hover interaction that only shows you a subset of the lines when you hover over a character or a chapter.

Ugh, enough time spent on those lines for now. I therefore moved on to adding the chapter and volume numbers. The inner donut chart inspired me to try something similar for the chapters as well. A donut chart with 50 equally sized, rounded-off sections, in which I could place a number. I was happy with the end result and could quickly move on.
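The angles behind such a ring are simple to work out. As a rough JavaScript sketch (the function name donutSections and the padAngle parameter are made up here, and the actual chart was built with d3's pie and arc helpers rather than by hand):

```javascript
// Divide a full circle into `count` equal sections, leaving a small gap
// (padAngle, in radians) between neighbors. Angles are in radians.
function donutSections(count, padAngle = 0) {
  const step = (2 * Math.PI) / count;
  return Array.from({ length: count }, (_, i) => {
    const startAngle = i * step + padAngle / 2;
    const endAngle = (i + 1) * step - padAngle / 2;
    return {
      index: i + 1, // chapter number to print in the section
      startAngle,
      endAngle,
      midAngle: (startAngle + endAngle) / 2, // where the number goes
    };
  });
}

// 50 chapters with a small gap between the sections
const chapterSections = donutSections(50, 0.02);
console.log(chapterSections[0], chapterSections[49]);
```

Each section's startAngle and endAngle can then be handed to whatever draws the arc, and midAngle gives the direction in which to place the chapter number.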

For the volumes (typically a collection of ±4 chapters) I started out with the same idea; a donut chart, but made even thinner, which I placed outside the ring of circles. Hmmmm, wasn't as happy with that, but in the meantime something else was bothering me even more. Now that I'd placed more elements on the page and all of it seemed to get a sense of "consistency", those inner lines just felt way off σ_σ

Perhaps they should have more "body", by making them tapered, as I did in my visualization about Dragon Ball Z? That again took more time tweaking my line formulas... Although I think it did improve things over the same-width lines I had before:

Adding the chapter and volume indicators

And in and of itself it had something nice going on when I implemented the hover interactions. Such as seeing who was in a particular chapter...

Tapered central lines for all characters that appear in the hovered chapter

...or when hovering over a character to see the chapters they appeared in...

Tapered central lines for all chapters that a character appears in

... However, I just felt that it didn't fit the rest of the visual in terms of design. Ugh! What to do about it!? And all those hours of work! (╥﹏╥)

And suddenly, very quickly after realizing that my swirly lines just weren't right, I had a new idea for the lines. I don't even remember what inspired me, it seemed to come out of nowhere (although that's never truly the case). Instead of making them swirl around, I could also make them run along circular paths. A bit like subway lines or piping in a home, but then transformed into a radial layout.

This idea was actually quite easy to execute. I could loop over each line to be drawn, create a tiny array of [radius,angle] points and feed that to d3's radialLine function, together with an interpolation function to curve the edges just a bit. Calculating the small array of radii and angles to feed to d3.radialLine was a walk in the park compared to my previous Cubic Bézier curve shenanigans! (Although actually enjoying solving geometry puzzles does help.) Naturally, all that didn't go right on the first try. And thankfully I could use some of the work I'd done with the swirly lines. The screenshots below were all made within 1 hour, not bad in terms of progress.

Different steps in the process of creating the 2nd iteration of lines
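To give a feel for what such an array of [radius, angle] points looks like, here is a dependency-free JavaScript sketch. It follows the same angle convention as d3's radial line (0 at 12 o'clock, increasing clockwise), but the names subwayPoints and toSvgPath, the three radii and the choice to sample the circular stretch in small steps are all assumptions for the sake of the example, not the formulas from the final visualization.

```javascript
// Convert one [radius, angle] pair to [x, y].
// Same convention as d3's radial line: angle 0 points to 12 o'clock
// and positive angles run clockwise (in radians).
function polarToXY([radius, angle]) {
  return [radius * Math.sin(angle), -radius * Math.cos(angle)];
}

// Way-points for a "subway" style connection: straight out from the inner
// circle, along a circular track, and then straight out to the outer ring.
function subwayPoints(innerRadius, trackRadius, outerRadius, startAngle, endAngle, steps = 16) {
  const points = [
    [innerRadius, startAngle],
    [trackRadius, startAngle],
  ];
  // Sample the circular stretch so that it stays round when drawn
  // with straight segments
  for (let i = 1; i < steps; i++) {
    points.push([trackRadius, startAngle + ((endAngle - startAngle) * i) / steps]);
  }
  points.push([trackRadius, endAngle], [outerRadius, endAngle]);
  return points;
}

// Turn the way-points into an SVG path string of straight segments
function toSvgPath(points) {
  return points
    .map(polarToXY)
    .map(([x, y], i) => `${i === 0 ? "M" : "L"}${x.toFixed(2)},${y.toFixed(2)}`)
    .join("");
}

// From a character at 12 o'clock to a chapter at 3 o'clock
const route = subwayPoints(100, 150, 200, 0, Math.PI / 2, 4);
console.log(toSvgPath(route));
```

With d3 you would hand an array like route to the radial line generator instead of writing toSvgPath yourself, and pick a curve type to round off the corners.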

Oh, and then I converted all those lines to HTML5 canvas by using the extremely useful .context option that is available in many of d3's drawing functions (such as d3.radialLine). That made things run more smoothly on the hovers!

Final look of the eventual lines between the two circles
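What makes the switch to canvas so painless is that the same list of points can be turned into moveTo and lineTo calls on a drawing context instead of into an SVG path string. The JavaScript sketch below shows that idea without d3; drawRadialLine and recordingContext are made-up names, and the recording object only stands in for a real canvas context (in a browser you would pass canvas.getContext("2d") instead).

```javascript
// Draw a list of [radius, angle] points onto anything that looks like a
// canvas 2D context. Angle 0 is at 12 o'clock, positive is clockwise.
function drawRadialLine(context, points) {
  context.beginPath();
  points.forEach(([radius, angle], i) => {
    const x = radius * Math.sin(angle);
    const y = -radius * Math.cos(angle);
    if (i === 0) context.moveTo(x, y);
    else context.lineTo(x, y);
  });
  context.stroke();
}

// Stand-in for a real canvas context that simply records every call,
// so the sketch can run outside of a browser.
function recordingContext() {
  const calls = [];
  const record = (name) => (...args) => {
    calls.push([name, ...args]);
  };
  return {
    calls,
    beginPath: record("beginPath"),
    moveTo: record("moveTo"),
    lineTo: record("lineTo"),
    stroke: record("stroke"),
  };
}

const fakeContext = recordingContext();
drawRadialLine(fakeContext, [[10, 0], [20, 0], [20, Math.PI / 2]]);
console.log(fakeContext.calls.map((call) => call[0]));
// → [ 'beginPath', 'moveTo', 'lineTo', 'lineTo', 'stroke' ]
```

Redrawing a few hundred lines this way on every hover is usually much cheaper than updating the same number of SVG elements.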

With that change of lines, I felt that, when hovering over a character or chapter, the resulting lines fitted perfectly with the "straight-roundedness" of the rest (I hope that made some sense, hehe). And as an added benefit, no more lines were overlapping!

The observant person will notice that I've actually implemented two slightly different line drawing "types". One is drawn by default when you're not hovering, but also when you hover over a character or color circle group. But another is used when you hover over a chapter. Try it for yourself and see what changes.

The possible interactions in the final visualization

Since part of the visualization was about the covers of the chapters, and because I knew most people that would land on the page would probably not know about CCS, I wanted to incorporate some of its imagery. And I just so happened to have a nice large circular area in the center. It took a while to manually "cut-out" a good looking square image from all 50 chapter covers. But at that time I was in an airport and on a plane anyway (coming back from a great night at the Information is Beautiful Awards where Data Sketches won GOLD in the "Unusual" category!! Woohoo!).

Hovering over Sakura reveals an image of her

With all these elements and layers of information in the visualization, I really needed a legend. After having written lines and lines of code to create custom legends in two recent client projects (such as this one for Article 19), I wasn't in the mood to do that again. Therefore, I created my legends in Illustrator instead. That saved a lot of time over creating them through code.

Legends explaining how to read the visualization

I initially placed these below the visualization. However (as I was getting used to in this project) these were not the final legends...

But let's not get ahead of myself. Now that the chart itself was nearly finished, I focused on general page layout and annotations. For the layout I had a lot of trouble coming up with something that looked even remotely interesting. To be honest, I'm still not that happy with the final result, but I just can't really design a webpage in itself, just data visualizations ʕ•ᴥ•ʔ

But then having to make that layout work on both mobile and desktop.... Not something I enjoy or want to recount here. Just know it took effort and time.

Early iterations of the general layout and annotations

While reading through all the chapters I took some notes of interesting story points that I wanted to highlight. Using Susie Lu's excellent d3-annotation library adding these around the circle was quite straightforward. Especially with the super handy editmode, that let me drag the annotations around, see where I wanted them and then add those locations hardcoded back into my code.

But after I had placed all the annotations (see the right visual in the image above) I wasn't happy enough with the result. Typically I love the lines that run from the point that you want to annotate all the way under the lines with text. But here that was getting too much visual weight.

I therefore wanted short lines that would radiate outward from the main circle, and place the annotation around those, typically centered. And although the annotation library gives you a lot of freedom, that particular design isn't in there. So instead, I created my own lines, and then used the editmode (see the small dotted circles below, those you can drag around) to position the annotations exactly where I wanted them.

After I felt the visualization was ready enough, I shared it with some friends to ask for feedback and got great suggestions for making the interactions easier to understand. But one also gave me a great example of a better legend, one that would show the visualization with its rings and explain what each ring truly meant. That was much better than those 3 separate ones that I had before. So I drew a new legend in Illustrator.

And after all that time and effort I finally had a visualization to share with everybody :) which can be found here: An ode to Cardcaptor Sakura

Cardcaptor Sakura - Fifty chapters of adorable cuteness

This project took me 86 hours to create. However, a lot of that time went into things that were not used in the final result. Such as ±5 hours on a CMYK canvas based dot effect, ±15 hours on swirly lines, ±6 hours on a page layout that isn't even really visible on the final page (except when you press "read more"), and ±2 hours on a stupid Chrome bug about horizontal scrolling (and how to come up with a "fix").

Still, I'm quite happy with how the visualization turned out :) I feel I haven't quite seen a radial visualization like it before. And Data Sketches is all about experimenting with new ideas, so it's always a joy when one turns out well ^_^ I hope you enjoy interacting with the visualization, even if you've never heard of Cardcaptor Sakura before (it's amazing!)


I’m starting to understand that the topics which make me really enthusiastic are quite niche. Not many other people know them. I can only hope this visual will be a joy to explore. I chose to dive into “Cardcaptor Sakura”.

Cardcaptor Sakura (or CCS) is a magical-girl manga (a Japanese comic or graphic novel) that was released about 20 years ago and revolves around a kind, brave and fearless girl called Sakura. It was the very first manga I owned and I’ve read it dozens of times. I picked up CCS because the cover looked like the most beautiful comic I had ever seen, and basically every panel within was just mind-blowingly perfect and cute.

Even though it’s been more than 20 years since the last chapter was published, I’d recently learned about a new “arc” of CCS that was released and thus it was on my mind again. One of my favorite things about CCS is how beautiful each page looks. The covers of each chapter are tiny works of art. I wanted to investigate those covers through data somehow, since I had never done any kind of analysis based on image data before. I loved the idea of creating a visualization that abstracted the colors of each cover into a few colors and thought it would be a fun way to explore and learn.

Fearless

November - December 2017
