A Breathing Earth

week 1 | data

I was reading the quarterly magazine of the Dutch branch of the World Wildlife Fund (WWF), which I get for being a donor. Suddenly I wanted to make a visualization that related to something the WWF might do. I "pitched" the idea to Shirley and we went back and forth a bit on what general topic would work for both of us. And then Shirley found her angle, the data visualization survey that had just come out, which gave us our topic of "community".

At the start of April I asked Twitter for help on datasets that one might associate with the WWF and got a whole lot of links (thank you very much dear followers!). However, due to being in the US the entire month, doing conferences & meet-ups and creating the presentation that Shirley and I gave at OpenVisConf in Boston on April 24th (which was another amazing conference!), I didn't get to do anything with the links until I was waiting at the gate of my flight home to Amsterdam on April 26th.

I received a lot of links to tracking data, either of animals or of buoys in the water. But I noticed that the search functionality of these data repositories is aimed at researchers. I could search datasets based on the id of a paper or the name of a scientist. But I couldn't request all tracking data of, say, whales... Another very prevalent type of dataset was the choropleth: filled regions on a map, representing things such as protected areas or animal habitat zones.

I started to meander through the links, going down the "Earth" related datasets, and I don't know how I got there, but at some point I found myself on the website of NOAA STAR, the Center for Satellite Applications and Research. Again, I was just clicking around, and then I came across an image of the Earth, colored by vegetation health. STAR calls it "No noise (smoothed) Normalized Difference Vegetation Index (SMN)" or "Greenness" for short :). This is what STAR had to say about the data: "[Greenness] can be used to estimate the start and senescence of vegetation, start of the growing season, phenological phases. For areas without vegetation (desert, high mountains, etc.), the displayed values characterize surface conditions."

STAR map showing vegetation health for week 1 of January 2016

And they had a map like this for every week in the year of 2016. There was also an option that showed all the maps in a very rough animation (as in, going through 52 images within a minute or two). Even though the animation was crude, and the color palette, well, not optimal, I really liked seeing the changes of vegetation health throughout the year. I wanted to visualize the same thing, but do it in my own style. To show a continuously "breathing Earth".

Furthermore, after seeing what Dominikus Baur did with the Pixi.JS library, while we were both presenting at the INCH conference last March, I knew I wanted to try it out as well. I just needed a project where I had to visualize a lot of points. And this idea seemed like the perfect opportunity to give Pixi a try, since there are often thousands if not millions of "pixels" in an image.

I was very happy to see that STAR also shared the data behind the images. However, I had never worked with geo-data at this level of sophistication before: HDF and GeoTIFF files. Thankfully, I had just seen the wonderful presentation of Rob Simmon on GDAL (the Geospatial Data Abstraction Library) at OpenVisConf. And according to Google, GDAL should be able to open these kinds of files. I followed the quick and hassle-free installation steps as outlined in Rob's blog. However, instead of trying to parse the files on the command line, which Rob's talk was about, I took to Google again to see if there was an R package instead. And of course there was! The appropriately named rgdal.

After also getting rgdal to work (you can read my steps at the top of this R file preparing the data), my next few hours were filled with understanding how to read in a GeoTIFF file, what it contained, how I could play with it and finally how I could map it (& how to switch map projections!). These blogs really helped me to understand the different aspects of working with the data: 1, 2 and 3, all from the wonderful website of NEON Data Skills. My first goal was to recreate one of the images from the STAR website, so I knew I had done and understood the steps. It took about 6-8 hours, but I have to admit, even with the sub-optimal color palette, I think the image below is just amazing in its detailed nature (just check out the bigger version by clicking on the image below).

Recreated map in R

Great, but these images were about 22 million pixels/datapoints, per week! There was no way I could load that amount of data into the browser, and do that 52 times. I therefore had to create lower resolution files. I did some tests and eventually reducing the resolution to about 50,000 (non-water representing) pixels looked like a good middle ground. Small enough for the browser to handle/read, but high enough to still see interesting details.
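To make the idea of "reducing the resolution" concrete, here is a minimal sketch of block averaging, written in Python purely for illustration (the actual preparation was done in R with rgdal, and may well have used a different resampling method). The function name `downsample`, the block size, and the use of `None` for water pixels are all assumptions, not taken from the original script.

```python
from statistics import mean

def downsample(grid, factor):
    """Average each factor x factor block of a 2D grid.

    Cells holding None (assumed here to mean water / no data) are ignored;
    a block without any valid cell stays None.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            block = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, rows))
                for c in range(c0, min(c0 + factor, cols))
                if grid[r][c] is not None
            ]
            out_row.append(mean(block) if block else None)
        out.append(out_row)
    return out

# Tiny made-up 4x4 "image"; None marks water.
week = [
    [0.2, 0.4, None, None],
    [0.6, 0.8, None, None],
    [0.1, 0.1, 0.5, None],
    [0.1, 0.1, 0.7, 0.9],
]
small = downsample(week, 2)  # 2x2 grid, one of its cells is all water
```

Only the non-water cells of the result would then need to be written out, which is how a full-resolution image can shrink to a much smaller list of points.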

As a note to somebody who I had this particular talk with: Yes, I could've made many highly detailed images and then turned these into an mp4 movie. Maybe that would've given the smallest file size with the highest resolution. But! That wasn't my goal this month! I wanted to learn something completely new: WebGL (or libraries built on top of WebGL), and this seemed like a good, but interesting starting project. Therefore, out of a purist approach, I wanted the eventual visualization to consist of actual numeric data read in and then made visual as thousands of small circles.

My next challenge was to think of a way to save the data in the smallest file(s) possible. Even 52 weeks of 50,000 datapoints each is going to take several megabytes. During my time trying to understand the file & data I noticed that > 90% of the weeks contained exactly the same number of datapoints. After some more investigation I saw that the x and y locations of these points were exactly the same. The first four weeks in the year had missing data, but they didn't contain any x-y locations that weren't in the other 48 weeks. I therefore made a separate file containing the x and y variables, and 52 separate files containing only 1 variable: the level of vegetation health. The row number would then connect the vegetation health value to the correct x and y location on the map (you can find all the data here). This made the file size of 1 week (i.e. 50,000 values) about 250 kB. (I also checked out gzip'ing the files, but I couldn't find a way to then unzip them in the browser. Sad, because the files were reduced to only 35 kB.)
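The file layout described above can be sketched as follows, again in Python for illustration only. The variable names, the sample values, and the choice to mark a missing value with an empty line are assumptions; the post does not say how the gaps in the first four weeks were encoded.

```python
import gzip

# Shared file: one (x, y) location per row, identical for every week.
coords = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Per-week files: one value per row, in the same row order as `coords`.
# None stands in for missing data (an assumed encoding).
weeks = {
    1: [0.12, None, 0.40, 0.33],
    2: [0.15, 0.22, 0.41, 0.35],
}

def points_for_week(week):
    """Re-attach locations to values purely by row number."""
    return [
        (x, y, value)
        for (x, y), value in zip(coords, weeks[week])
        if value is not None
    ]

def serialize(values):
    """One value per line; an empty line keeps the row order for a gap."""
    return "\n".join("" if v is None else str(v) for v in values).encode()

raw = serialize(weeks[2])
packed = gzip.compress(raw)  # round-trips with gzip.decompress(packed)
```

Because the coordinates are stored once, each weekly file only carries a single column. As a side note on the gzip remark: depending on the hosting setup, a web server can often send files with gzip content encoding, in which case the browser decompresses them transparently, but whether that was an option for this project is not something the post tells us.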

week 2 | sketch

Sketching was super simple this month, since my idea was very simple. I wanted to turn the image/pixel based data about vegetation health into thousands of circles. And these circles would animate through the 52 weeks of data, giving the idea of pulsation; getting bigger and darker, although more transparent, when more healthy and smaller, more yellow for low values. And it was just going to be these circles, no other types of mapping "markers" such as country borders, our Earth is beautiful in itself :) I really had to do the more design based aspects (colors, sizes, etc.) with all of the actual data on my screen, so I didn't even brainstorm about more details during the sketch. Instead, most of the page below is filled with ideas on how to create the final datasets. How to make the files as small as possible (in the end I created even more minimal versions that I jotted down in the bottom right section of the left page).

The only sketches made for the project
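The encoding from the sketch (healthier means bigger and more transparent, less healthy means smaller) boils down to a few linear scales, the kind of thing d3's scales take care of. Below is a minimal Python sketch of that idea; the input domain of 0 to 1 and every output range are made-up illustration values, not the ones used in the final piece.

```python
def linear_scale(domain, output):
    """Return a function mapping domain -> output linearly, clamped."""
    (d0, d1), (o0, o1) = domain, output

    def scale(value):
        t = (value - d0) / (d1 - d0)
        t = max(0.0, min(1.0, t))  # clamp values outside the domain
        return o0 + t * (o1 - o0)

    return scale

# Hypothetical ranges, chosen only to show the direction of each mapping.
radius = linear_scale((0.0, 1.0), (0.5, 3.0))   # healthier -> bigger
opacity = linear_scale((0.0, 1.0), (0.9, 0.5))  # healthier -> more transparent

def circle_for(value):
    """Visual attributes for one point of the map."""
    return {"r": radius(value), "alpha": opacity(value)}
```

A color scale from yellow to dark green would follow the same pattern, one channel at a time.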

week 3 & 4 | code

I started out getting the data on the screen with HTML5 canvas. I knew that d3's standard approach of SVGs was definitely going to fail here, since I also wanted to implement transitions. Therefore, I didn't even try SVGs (however, I always use d3 for things such as scales, colors, the stuff to prepare the visual). Thankfully, canvas is quite straightforward, having done a few other projects in canvas over the past year (like February's Marble Butterflies). And I only wanted to plot circles at certain locations, so I got that working for 1 map/week's worth of data quite easily. Below are some steps in the process; first just the circles in the right location (all having the same size and color, but differing opacity); adding colors; adding a multiply blend mode and different circle sizes.

Three steps showing the creation of the map in canvas

I then made a simple interval function that would switch between the 52 maps as fast as it could. So not even animating the whole. And as expected, that took about 2-3 seconds per map. Definitely not a "frame rate" that I could use for natural looking animations.

Therefore, I dove into Pixi. I opened up a whole bunch of examples, especially those that I could find on blockbuilder.org, combining d3 with Pixi, like this one by Irene Ros. I started out using PIXI.Graphics, but let me spare you any more details of the code, in short, it was surprisingly easy to pick up (especially compared to regl and WebGL that I went into later o_O). However, there were some weird pixel rounding "things" going on...

Strange effects in plotting circles with Pixi

And it was slow...

I couldn't really find a solution to making Pixi faster for my specific case through Google, so I did the next best thing: I asked on Twitter.

Asking my question on Twitter

And damn, I was so amazed by all of the people providing ideas! Even when I explained more, or asked more, practically all would reply and even share some sandbox examples 😀 - I had several conversations going on, some of them focusing on using regl or ThreeJS, but I also got some interesting ideas on Pixi. For example, I learned that to get the fastest performance with Pixi (even when it uses WebGL) you have to use something called "Sprites". You can sort of see these as small "images". This example with bunnies shows it quite well: you can have hundreds of thousands of the same 5 "bunny" images bouncing around. Or this example with the same green square png moving around. But I didn't have images, I had thousands of slightly different circles. But then I got the following tweet from Matt DesLauriers:

The idea to use a white circle png in Pixi

Which I tried and it worked and it seemed fast enough! However, when you looked closely the circles weren't very circular. They looked rather pixelated, bummer...

Pixelated circles with Pixi Sprites

And that's when I decided to give regl a try. I had seen a very inspiring presentation by Mikola Lysenko at OpenVisConf featuring bouncing bunnies. And then when Peter Beshai uploaded this block that animates 100,000 points with regl (which is also mesmerizing to look at) I knew it was enough to get started with. A bit later Ricky Reusser sent me the same demo, but coded slightly differently, for those interested.

At first I hoped to get the hang of regl by going through the above (and other) examples. But after an hour or so I acknowledged the fact that I really didn't understand anything yet and that I had to read some introductions to WebGL, GLSL and shaders, hehe 😅 - It took a while, but my brain slowly started to wrap its head around the concepts of shaders, fragments, vertices, attributes, uniforms and varyings (some sites that helped me: 1, 2, 3, 4 and of course the Book of Shaders, although I only skimmed through the first 2 chapters).

I started out with this very simple example block by Adam Pearce which creates a triangle. I then slowly started adjusting it, relying heavily on Peter Beshai's block, to show circles on a map. Some things could stump me for quite some time, like opacity not acting like I expected (lower left image)...

Several steps of getting the map working in regl

It wasn't only Pixi that had its difficulties in placing the circles on the map without strange effects. I got the below interesting circular-ish pattern in regl at some point. I eventually fixed the whole issue by making sure the size of the map would be an exact multiple of the number of points in both the horizontal and vertical direction.

Strange effects in regl
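One way to read that fix: if the map size is an exact multiple of the number of points, the spacing between neighbouring points is a whole number of pixels, so no point has to be rounded differently from its neighbours. A minimal Python sketch of that sizing rule follows; the function name `snap_size` and the grid dimensions are hypothetical, and the real grid size is not given in the post.

```python
def snap_size(target_px, n_points):
    """Largest size <= target_px that is an exact multiple of n_points.

    Returns (size, spacing). Falls back to one pixel per point when the
    target is smaller than the number of points.
    """
    spacing = max(1, target_px // n_points)
    return spacing * n_points, spacing

# Hypothetical grid of 300 columns by 150 rows.
width, dx = snap_size(1000, 300)
height, dy = snap_size(520, 150)

def center(col, row):
    """Pixel position of the middle of grid cell (col, row)."""
    return ((col + 0.5) * dx, (row + 0.5) * dy)
```

With these numbers the map shrinks from the requested 1000 x 520 to 900 x 450, and every point sits exactly 3 pixels from the next one.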

Well, it took a lot of browsing through example code, but eventually I had a map in regl with circles and opacities (although no "multiply" blending going on, I didn't yet know how to get that working). But again, if I zoomed in, I saw the same pixelated effect, AARGH! And here I was really hoping that regl would not have the same issue as Pixi... I did notice that it was a bit faster in rendering than Pixi though.

Regl map still pixelated

While trying to find info on getting anti-aliased circles in regl I came across a snippet that showed that Pixi actually has an "anti alias" setting! And not long before, Alastair Dant made this animated Pixi example that I tested with 50,000 circles which still seemed to work smoothly. These two interesting avenues to explore brought me back to my Pixi based map.

Btw, at some point during the data preparation I made a big change in how I saved the final files. However, I made an error, which gave the result below where the locations are randomly shuffled, oops... Not such an interesting map anymore 😅

Accidental random ordering of the pixel locations

Another hour or two of work adjusting the example by Alastair to my data, playing with some anti-aliasing things and I was finally looking at a smoothly changing map, YAY!
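A "smoothly changing" map implies in-between frames between two consecutive weeks. The post does not spell out how those frames are computed, so the following Python sketch simply assumes a linear blend between neighbouring weeks, wrapping from the last week back to the first. The function name and the sample data are made up.

```python
def frame_values(weeks, t):
    """Values at fractional time t >= 0 (in weeks), wrapping around the year.

    weeks is a list of equally long value lists, one per week; row i of
    every week refers to the same location on the map.
    """
    n = len(weeks)
    i = int(t) % n           # week we are coming from
    j = (i + 1) % n          # week we are going to (wraps at year end)
    f = t - int(t)           # progress between the two, 0..1
    return [a + (b - a) * f for a, b in zip(weeks[i], weeks[j])]

# Three tiny made-up "weeks" with two points each.
weeks = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
halfway = frame_values(weeks, 0.5)    # between week 0 and week 1
wrapping = frame_values(weeks, 2.5)   # between the last week and week 0
```

The resulting values would then feed the same size, color and opacity scales as the raw weekly data.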

After I got Pixi working I started another Twitter request to help me with the anti-aliasing and "multiply" blend mode in regl (because I had noticed before that regl was faster than Pixi, so I wanted to give it another try). It wasn't long before Yannick Assogba sent a block that was a remix of Peter Beshai's version, but then with circles instead of squares. And Alastair helped out again by making a block that animated a lot of circles using regl with anti-aliasing (looking amazing btw!). And a day before, Robert Monfera had sent a block showing how to transition between anti-aliased shapes in regl. These examples increased my understanding of how to tackle the anti-aliasing, which eventually led me to this blog that worked perfectly in my map. Look at those nice circles (even after zooming in):

Smooth anti-aliased circles in regl
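The linked blog is not reproduced here, but a common way to anti-alias a circle in a fragment shader is to fade the alpha across a thin band around the edge instead of cutting it off hard. The Python sketch below shows only that arithmetic, under the assumption that the approach used was along these lines; `circle_alpha` and the `feather` width are hypothetical names and values.

```python
def smoothstep(edge0, edge1, x):
    """Same formula as the GLSL built-in of that name."""
    t = max(0.0, min(1.0, (x - edge0) / (edge1 - edge0)))
    return t * t * (3.0 - 2.0 * t)

def circle_alpha(dist, radius, feather=1.0):
    """Opacity of a pixel at `dist` pixels from the circle centre.

    1.0 well inside, 0.0 well outside, and a smooth ramp across a band
    `feather` pixels wide centred on the edge.
    """
    return 1.0 - smoothstep(radius - feather / 2, radius + feather / 2, dist)

inside = circle_alpha(0.0, 5.0)    # fully opaque
on_edge = circle_alpha(5.0, 5.0)   # half transparent
outside = circle_alpha(10.0, 5.0)  # fully transparent
```

A hard cut-off (alpha is 1 when `dist < radius`, else 0) is what produces the jagged, pixelated edges seen in the earlier screenshots.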

Check! That only left the "multiply" blending that was missing from the regl version. But that turned out to be one step too far. After a Twitter DM chat with Robert Monfera I had a collection of websites about WebGL blending functions, premultiplied alpha (don't ask...) and some Pixi source code files (Pixi was able to do "multiply" in WebGL, so maybe I could find a clue there). I was very surprised that I could not find a single example of a multiply blend in WebGL through Google (where the multiply was based on many elements overlapping, not just two predefined images). And I guess I shouldn't be surprised that I couldn't figure it out either, having only started learning about shaders 2 days before, hehe. I did get a lot of interesting other color combinations... Well, actually I did get one result where multiply was working (top right image), but I could not combine that with opacity/circle shapes (which, more than 2 days ago, I would've thought was weird, "opacity is separate from color right?"... well no, not quite in WebGL I've since learned). For those interested, these sites really helped me to understand the different blending functions: 1, 2, and a blending example by Alastair.

Different errors in trying to create a multiply blending
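For reference, this is the arithmetic that a "multiply" blend with opacity has to produce on an opaque background, following the standard definition of the multiply blend mode combined with normal alpha compositing. The Python sketch only shows the math for one pixel; it is not how Pixi, canvas or any of the regl demos implement it, and the function name is made up.

```python
def multiply_over(backdrop, source, alpha):
    """Multiply-blend one RGB colour onto an opaque backdrop.

    Channels are in 0..1. With alpha = 1 this is a plain multiply
    (backdrop * source); with alpha = 0 the backdrop is left unchanged.
    """
    return tuple(b * (1.0 - alpha + alpha * s) for b, s in zip(backdrop, source))

white = (1.0, 1.0, 1.0)
green = (0.0, 0.5, 0.0)

once = multiply_over(white, green, 1.0)   # one circle on white
twice = multiply_over(once, green, 1.0)   # a second, overlapping circle
unseen = multiply_over(white, green, 0.0) # fully transparent circle
```

Note that the factor applied to the backdrop mixes alpha and color into a single number per channel, which hints at why opacity cannot be treated as something separate from color here. In WebGL terms, a blend function that multiplies the incoming color by the color already on screen would cover the `alpha = 1` case; folding the alpha into the color before blending is one conceivable way to get the rest, but that is only a guess.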

At some point the creator of regl, Mikola Lysenko, even started helping me, which was awesome, but we didn't manage to replicate the results that canvas & Pixi were showing 😖 - Eventually I had spent enough time experimenting with the blending. I therefore decided to leave it without any blending. The Pixi version was therefore going to be my "main" project site. However, since I had the regl version animating perfectly fine I decided to clean up the page (better color palette, adding titles and such) and link to it from the eventual main page. That way people can compare the different tool-based versions 🙂

Final result with regl

May 20th UPDATE | Apparently Ricky Reusser hadn't given up the "multiply" fight yet, and about a week after I published "A Breathing Earth" he tweeted a regl demo that had both multiply and opacity working (below is what I turned the demo into so I could compare it to, say, Illustrator). How amazing is that! So now I can say that all 3 versions are exactly the same ^_^ and regl is definitely the fastest. I even had to "slow" it down a bit otherwise it would just whoosh through a year in no time.

Adjusted version of Ricky Reusser's example that shows that multiply is working

I had learned many things while going through all the wonderful examples people had sent me. I therefore returned to the canvas version to see if I could make it faster. After a bit of messing around, I managed to get it to switch between maps every second, but that is still too slow for a smooth animation (nonetheless, I also cleaned up the canvas version and added it to the main page for comparison).

Btw, another new thing I implemented this month was breathe.js. I heard about it when attending the d3.bayArea meetup last April. I noticed that the final map preparation loop was freezing up the browser for ±10 seconds. And with some very minor changes using breathe.js that wasn't happening anymore, without it seeming to take longer to prepare the data 😀
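The general idea behind avoiding such a freeze is to cut one long loop into chunks and hand control back between chunks, so other work can run in the meantime. The Python sketch below illustrates that pattern with asyncio; it is not breathe.js's API, and the function name, chunk size and sample data are all made up.

```python
import asyncio

async def prepare_in_chunks(items, work, chunk_size=1000):
    """Apply `work` to every item, yielding control between chunks."""
    results = []
    for start in range(0, len(items), chunk_size):
        for item in items[start:start + chunk_size]:
            results.append(work(item))
        # Hand control back to the event loop so other tasks can run.
        await asyncio.sleep(0)
    return results

# Ten made-up values, processed three at a time.
prepared = asyncio.run(
    prepare_in_chunks(list(range(10)), lambda v: v * 2, chunk_size=3)
)
```

The total amount of work stays the same, which matches the observation that the preparation did not seem to take longer; it just no longer blocks everything else while it runs.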

Eventually I added a bit more text to the main final version (and a legend), keeping it very minimal so the focus is on the map. And quick links to the minimal versions that use canvas, Pixi and regl. For those interested, I've been timing myself since January, and this month took me 57 hours to complete (from ideation, data, sketching and coding (3x, once for each "tool")), of which I'm guessing the regl "multiply" stuff took at least 10 hours...

Final result of the 'Breathing Earth' visual

Having chosen the regl library to work with WebGL I decided not to go into ThreeJS as well. It was just too much to handle in one week, so many new programming libraries, hehe. Nonetheless, I still want to share these 3 examples that show the concept of changing circles that I received through Twitter: a block with 50,000 circles animated with WebGL, custom made by Robin Houston, and two codepens with 50,000 animated circles, one using circle "pentagons" in ThreeJS and another version using DataTextures in ThreeJS, both custom made by Matt DesLauriers.

This was a very technical "month" for me. Sure, the visual in itself isn't so out of the box as the typical month, it's been done before in different ways, but as I often tell others, you can't expect to create wonderful, crazy new things when you're just starting out with a tool. Instead, I haven't learned so many new (coding) languages/libraries within a week, since, well, maybe ever. And I couldn't have done it without the help of a lot of people. I would specifically like to thank Robert Monfera, Alastair Dant, Ricky Reusser, Matt DesLauriers, Peter Beshai, Mikola Lysenko, Yannick Assogba, Robin Houston, Amelia Bellamy-Royds, Jan Willem Tulp, Mike Brondbjerg, Paul Murray, and Mathieu Henri who have all helped me in different areas of either Pixi, regl, ThreeJS and more. I don't think I would've been able to get my map(s) working without the ideas and examples they shared. THANK YOU!

introducing our guest
Sonja Kuijpers
Voices that care

It’s always nice to catch up with Nadieh, and since we both live in the Netherlands we meet each other on several occasions, such as the Infographics Congress last March, where this time she dropped me a question: was I interested in joining as a guest on Data Sketches? If I was interested? I responded pretty calmly (I think..?) but inside I went ‘hurray’ because I greatly admire Nadieh’s and Shirley’s projects and their openness about their processes. Then, when the idea of taking part landed, I completely freaked out for the same reason: I feel so unskilled compared to them doing their code, D3 and all that jazz. It didn’t quite help watching them these last months showing up at different venues, talking about their super-interesting, well-designed data visualizations and the tools they master. Luckily they both assured me they asked me precisely because I handle my projects differently from them. I come from a design background; I studied Public Space at the Design Academy and formerly had a career at a landscaping / urban design company. I use other tools and I mainly work manually in Illustrator. Nadieh and Shirley would like to see how I tackle and visualize a dataset my own way, which can be kind of ‘monnikenwerk’ (literally ‘monks’ work’: slow, painstaking labour) as we say in Dutch.

Well. My journey then.

week 1 | data

The subject first suggested was Charity. I feel there’s a lot of data out there from many great charities that could be communicated in great ways, but we all got depressed by the subject because, you know, it almost always concerns the number of people (or animals) dying or going to die of some terrible disease, famine, or natural or accidental disaster.

To not demoralize our viz-month the subject changed to community, but I lingered on the charity subject because I was triggered by my memories of the Eighties. My nostalgia goes back way further than Nadieh’s and Shirley’s :-). I was becoming a teenager back then and got introduced to pop music, pop culture and idols. There was a vibe in those days of community, of artists uniting to make a difference. They came together to record songs for charity. Many of those songs are considered tacky, and that’s true in my opinion (all these horrible, horrible covers!), but I must say I do a pretty good Cyndi Lauper “Well, well, well, let's realize that a change can only come” and always choke up during the chorus of “Do they know it’s Christmas”. (I have my flaws)

That same vibe isn’t around any more, but I started wondering about the current state of charity songs: are they still being created, who’s singing them and for what cause?

What’s to find in the data?

I started looking for data concerning songs for charity. I found several lists, including wikipedia.org/wiki/Charity_record, which I do acknowledge is incomplete, but it was the best I could find and I had to start somewhere. I started completing and sorting the data (dates, singers, cause/category) in Google Sheets using Scraper (a Chrome add-on) and had to do a lot of manual searching and altering of data due to misspellings, missing data on the artists, and sometimes aliases of artists. Again, I did the best I could, but I’m sure I’ve overlooked some info and, well, it’s ‘a’ list not ‘the’ list, because it’s mostly oriented towards the western market (with some exceptions) and probably many other songs are not listed. I thought of visualizing which song made the most profit, but that was undoable due to the time range (songs dating back to the Eighties), non-traceable record sales, inflation, and the question of which way the money eventually went (that’s somewhat obscure as well sometimes). There’s simply not enough data available on this subject. What I also considered was some viz on the lyrics, so maybe a word count or a focus on themes: hope, rescue, Comic Relief, etc.

There was data on the charities sung for, so I started ordering, grouping and classifying the songs by their initial goal, or regrouped them into a unifying category.

The categories being:

  1. famine, poverty, disadvantaged people, homeless / refugees
  2. diseases (aids, cancer, other diseases), health (common health)
  3. children
  4. Comic Relief
  5. equality / empowerment / awareness
  6. victims of terror / war, support veterans / heroes / soldiers
  7. environment
  8. sports
  9. disaster: accident
  10. disaster: nature

week 2 | sketch

Now Illustrator is probably the program I’m most skilled at (acknowledging I know only this part) but I’m not going to go into too much detail on the tools, because that’ll take too much time, and if you would like to know more I advise you to watch some beginner tutorials, practice, or contact me if you have any questions. As I was educated at the Design Academy and worked at a landscaping design company I had to use various design programs, so I learned on the go, but I must say I made and still make lots of styling/layout decisions based on intuition.

At the start of a project I try to sketch what I’d like to achieve and sometimes I get the wildest ideas, but there’s also this realization that keeping it simple (we’ve seen enough discussions on that subject) mostly works best. Conclusion this time: no circular layout (which I’m pretty fond of) or much decoration. Most of the time my sketches are rather messy. I’m just no Alfonso or Giorgia :-(

week 3 | design

Next to the sketching I also search for inspiration (mostly on my Pinterest boards which you can find here) and / or I have some style in mind which I like to apply.

I start off with creating swatches and try to pick some fonts to have some kind of frame to work with. This of course gets altered and supplemented during the process, but personally I need those first “restrictions” to get going. And like I mentioned, my styling has to do with intuition, which I believe most of the time turns out just fine :-)

I started off creating a timeline which mentions all the songs and information on what went on at that time. But in the end, well, that’s all it is, a timeline, and I felt it wasn’t compelling / insightful enough to publish. Ta ta to the many hours spent on it, but I’m still figuring out a way to publish it in an intriguing way some other time. For now I felt I had to aim at something else, and since our subject did change to “communities”, that’s where I tried to shift my focus.

Since we all probably know that Michael Jackson was a great contributor to (children-related) charities and that Bono has been doing stuff with like every other famous artist on the planet, I was wondering how many songs in this list artists wrote / sang / contributed to, and with whom. As I wanted my main graph to be visually not too tangled / cramped or too big to fit on a screen, I made the decision to only visualize the artists who sang (note: not played an instrument, organized or contributed in another way) more than twice. This left me with 26 artists to visualize.

Now the main graph was pretty straightforward at first: I just connected the songs on the timeline at the left with the artists on the list at the right by drawing curvy lines. I used a gradient on these lines (along stroke!) so it all wouldn't get too meshed together. I also decided to split the songs per artist by ensemble category using small symbols. The Symbols tool comes in very handy if you need to add many similar icons or other elements: you draw one single icon or other element on your artboard, then select it and add it to the Symbols panel (just drag it in and choose your settings), and you can place it in your design where needed (drag it from the panel into your design or copy-paste from the first one). Then, when you decide they need adjustments, you do not need to redesign each individual element in your design; you just change the one symbol in the panel and all placed elements change accordingly.

Not satisfied with the “connectedness” of artists, I added another layer on the right showing the network of cooperations. And to get more insight into the totals per ensemble category I added another layer at the left. Since I left the timeline graph for what it was, I felt I had to add depth to my current graph, creating clarity and answering my question “Who participated on which song, and with whom of those 26 did they do that?”

I fiddled around with some ideas and try-outs on what else to show/tell and soon came to a sort of small multiple for each individual artist (Ooh, don’t we all just L.O.V.E. small multiples?!). It was quite elaborate to work this out, multiplying a basic graph 26 times and then deleting all unneeded elements, but somehow that’s where the fun starts for me: it’s somehow meditative to do these repetitive actions (to a certain amount of course!) and when convinced the outcome will be just fine I simply submerge in the task. And then. Then I lost almost everything due to my file crashing (damaged beyond repair!), my program also crashing, having to re-install, and of course in the rush I hadn’t saved alternative versions. I know, I know! Stupid me! I had to go into meditation-mode all over again, but that’s a bit hard while in a disillusioned-wanting-to-throw-my-laptop-out-of-the-window-cursing mood.

So, I drank a large beer, got my act together and, knowing the correct steps and the time it would cost me (I learned from the first time around, so I could skip some steps which I had found out were unnecessary), I got back onto it. I also decided to create the small multiples in a separate file and paste them into the final file at the end, to avoid the total getting too big/slow (and crashing again). A small multiple of an artist (which sometimes is a duo or a group) shows only the connections of the artist in question to the songs they sang and the connections to other artists they participated with. I emphasized the lines of the artist concerned by fading the other artists’ lines. I could have included much more info (how many artists in total per song / artist, cooperations other than in these songs, covers) but I found I had to stick to a certain amount of data to not clutter the graphs and to avoid spending way more time on collecting data and on this project as a whole.

I created a layout showing the artists’ small multiples in the same order as in the artist list in the main graph and added some info on Michael, Bono and Elton since they contributed the most.

I found out (too late) there’s this page on the world of celebrity giving, named “Look to the Stars”, which gives great insight into who donates to what charity and for which cause. I definitely would want to work on that data in another project.

So I know there are no heavy conclusions to draw out of this viz, but it does give some insight into which decade(s) artists participated in and with whom. Sometimes as expected, as is the case with Michael Jackson and Bono, but sometimes somehow surprising: while also successful then, why was Jon Bon Jovi not participating in the Eighties? Now the whole process took way more time than anticipated (like it mostly does with autonomous projects), and due to the change in direction and the described ‘hiccup’ my viz is way overdue, but I enjoyed the ride and I hope people will enjoy my visualization. The communication with Nadieh and Shirley was comforting and helpful, as were the critiques by my partner Tamara, who is also an (architectural) designer and is always there to glance over my designs and check if they’re comprehensible.

For now I consider this a wrap up!

Meantime I’ve also been working on my website (which was about time to go online) and I have a little treat on my independent-projects pages: I’m adding a link to the Adobe Color CC library so you can download the basic colors I’ve used in these projects. Here you find the colors for this dataviz, I hope you enjoy!

And just to get (back) into that 80’s vibe, listen / watch this Charity song: Stars by Hear N’Aid. Not my genre but epic! ;-)

Cheers, Sonja