  • Foundations of Random Number Generation in JavaScript

    Being able to generate (apparently) random numbers is an essential requirement for many areas of math, statistics, the sciences, technology and gaming. For instance, they can be used to assign participants to a group in a randomized controlled trial, to determine the output of a computer game or to find approximate solutions to otherwise intractable problems. I frequently use random numbers to check my solutions to perfectly…

    • Thu, Jul 14 2016
  • SVG versus Canvas

    Suppose you want to draw something on your web page using browser-native technologies. It might be some kind of animated scene, it might be a technical diagram, it could be some kind of custom infographic. What should you use? As the title of the article suggests, there are a couple of obvious answers: a canvas element or a scalable vector graphic (SVG). In some cases, either SVG or canvas might work reasonably well and…

    • Thu, Jun 23 2016
  • The Importance of Prose in Communicating Data

    If you're a data communicator, having a good understanding of chart and table design is important. Thankfully, the art and science of creating effective charts and tables is the subject of a great number of books. (One of my favorites is Stephen Few's Show Me the Numbers.) This doesn't, however, mean that how we use ordinary prose - spoken or written - should be ignored.

    About a year ago I wrote an article…

    • Thu, Jun 2 2016
  • New Solutions to Old JavaScript Problems: 2) Default Values

    Introduction

    This is the second in a series on how new JavaScript features, introduced in the ECMAScript 2015 standard (aka ES6), allow for simpler solutions to some old JavaScript problems. In part 1 I covered block scope and the new let and const keywords. Here I will look at default arguments for functions.

    Default Arguments the Old Way

    Here's a very simple JavaScript function for logging a greeting message to the…

    • Mon, May 9 2016
  • When it Comes to Dataviz, Color is Complicated: Part 2

    This is Part 2 (in a series of 2) on why color is a complex and confusing topic. In Part 1 I looked at cases where colors might not be interpreted as expected. Here I'll cover the difficulties of picking a suitable palette.

    Be Subtle

    Even if you avoid color contrast illusions and palettes that are difficult for those with CVD to interpret, it's still easy to make something that looks bad. Strong, saturated, vibrant…

    • Mon, May 2 2016
  • When it Comes to Dataviz, Color is Complicated: Part 1

    In all the articles I've written here I've covered a fairly broad range of topics related to data visualization: the use of tick marks and labels, data density, the problems with dual-axis charts and much more. I've touched upon the use of color a few times but only in passing. That's because I think, while interesting, the topic can be quite confusing and that makes writing short articles difficult. In this two-part…

    • Wed, Apr 27 2016
  • New Solutions to Old JavaScript Problems: 1) Variable Scope

    Introduction

    I love JavaScript but I'm also well aware that, as a programming language, it's far from perfect. Two excellent books, Douglas Crockford's JavaScript: The Good Parts and David Herman's Effective JavaScript, have helped me a lot with understanding and finding workarounds for some of the weirdest behavior. But Crockford's book is now over seven years old, a very long time in the world of web…

    • Tue, Mar 22 2016
  • Stacked Area Charts and Mathematical Approximations

    I've previously noted that I think stacked area charts are frequently used when a conventional line chart would be a better option. Here is the (fictional) example I used previously and the conventional line chart alternative.

    In short, if you want people to be able to make reasonably accurate judgments of the magnitudes of the individual components, and how they change depending on some other variable (such as time…

    • Thu, Feb 25 2016
  • Why We Should Report More Than Just the Mean

    Numbers without context are of very limited use. So it's a good thing that articles in newspapers and reports in the wider world will often compare the figures they relay to the (mean) average. But invariably that simply isn't enough to get a gauge of what the data being reported really tells us. There's an old "joke" about a statistician who drowned in a lake of average depth a few inches (the precise average depth seems…

    • Thu, Feb 18 2016
  • A Step-by-step Introduction to JavaScript Sets

    As mentioned previously, we have a new JavaScript standard commonly known as ECMAScript 6 (ES6). I've spent quite a bit of time recently reading around the new features outlined in the standard that are coming to, or have been recently implemented in, our browsers. One of my favorite additions is the new Set type. A set is somewhat like an array. It's a place to store values: numbers, strings, objects, actual arrays…

    • Fri, Feb 12 2016
  • Conveying the Right Message

    We communicate data and the conclusions we've drawn from our datasets in numerous ways: conversations, presentations, reports, charts shared on Twitter... Frequently, the results will be compressed. After all, we can't write a 200-page thesis on every bit of research. Instead we might publish a short article with a headline. And busy people may only read the headline. The end result can be something like a game of …

    • Mon, Feb 8 2016
  • Introducing JavaScript's Math Functions

    As of June 2015, the world has a new JavaScript standard. Officially called ECMAScript 2015, but commonly known as ECMAScript 6 (ES6) and sometimes ECMAScript Harmony or ECMAScript.next, the specifications detail a broad range of new features you can expect to see being implemented in browsers in the coming months and years. This includes — but is not limited to — block-level scoped variables let and cons…

    • Tue, Feb 2 2016
  • Demystifying Box-and-whisker Plots — Part 2

    Having shown you how to read range bars and box-and-whiskers in Part 1, I now want to use some real-world data to illustrate why they can be useful. Specifically, I'm going to use data relating to the UK general election of 2015. First, for those not familiar with the UK's political system, I'll give a brief overview of how our electoral system works.

    The UK is divided into 650 constituencies for the purpose…

    • Tue, Jan 26 2016
  • Demystifying Box-and-whisker plots — Part 1

    If you browse through a large printed newspaper and pick out all the charts you find, you'll probably come across some of the following: bar chart, timeseries (line chart), pie chart, donut chart, stacked area chart. You may come across the odd scatter plot too. If you've picked up the New York Times on a good day then you might even stumble into a connected scatter plot. What you're highly unlikely to find is…

    • Mon, Jan 25 2016
  • Image Manipulation with HTML5 <canvas> element

    Introductions to canvas too frequently concentrate on how it allows web developers to draw all manner of graphic objects, from straight lines and rectangles to complex Bezier curves, onto the screen. Here, however, I'd like to focus on another use case: photo-editing in the browser. If you're keen to see what can be done with canvas right away then skip on down to the interactive examples below and come back here when you want…

    • Wed, Dec 2 2015
  • Minimalist Maps: Are They a Good Idea?

    Data maps are everywhere. And it's not just the conventional ones that use Google Maps, OpenStreetMap or Bing Maps to show the underlying geographical information. Cartograms, "maps" with land masses resized based on data, are quite popular. I'm not a big fan of them because they require us to judge magnitudes based on the relative sizes of some peculiar and many-sided shapes. Generally, we're not very good at this…

    • Tue, Dec 1 2015
  • Choosing the Right Way to Flatten the Earth

    The art and science of drawing our 3D Earth onto a 2D sheet of paper or computer monitor is worthy of a book. Unfortunately, I've only got a few hundred words. As a result, I'm going to largely concentrate on a single, controversial choice: the Mercator projection.

    When evaluating map projections we're not just talking about finding a suitable representation for a sphere in Flatland, because the Earth isn…

    • Wed, Nov 18 2015
  • Jitter - Another Solution to Overplotting

    Back when I discussed tricks for coping with overplotting I omitted (at least) one popular "solution": jittering the data. Jittering is the process of adding random noise to data points so that when they are plotted, they are less likely to occupy the same space. It is most commonly used when the data being plotted is discrete. In such cases, in the absence of jitter, it's not just that the edges of data-point markers…

    • Wed, Nov 11 2015
  • Visualizing the Data Behind Your Images

    I think it's an interesting exercise to visualize data contained in a photograph as we might other datasets. Some cameras and photo-editing software do just this, constructing histograms from the red (R), green (G) and blue (B) values (the three "primaries") in the image to help expert photographers/editors judge and improve color balance. To illustrate this idea, I'll use the color photograph and its grayscale counterpart…

    • Mon, Nov 9 2015
  • Simplifying Visual Search for Presentations

    When giving presentations it's very tempting to try to squeeze as much information as possible into a short time slot of perhaps only ten or fifteen minutes. Practice certainly helps when it comes to matching the talk to the time slot, but we really should also consider whether the audience is likely to have had enough time to absorb all the information we've thrown at them.

    One of the difficulties is that…

    • Mon, Nov 2 2015
  • How to Improve Your Data Visualizations with Annotations

    Merriam-Webster defines an annotation as "a note added to a text, book, drawing, etc., as a comment or explanation". Chart annotations can provide extra detail, highlight points of interest or simply be used for disambiguation purposes. However, filling a graphic with annotations can distract from the visual salience of the data itself, so it's important to find the right balance. If we say that a chart's title, axis…

    • Tue, Oct 27 2015
  • Some Thoughts on Data Density

    Back in February I wrote about 7 Do's and Don'ts of DataViz. My first "don't" was "Don’t use a chart when a sentence will do" with the accompanying bar chart:

    As noted in that article, the bar chart above can be omitted in favor of a simple sentence like "237 of the respondents preferred Product A, while only 112 preferred Product B" without any reduction in understanding to the…

    • Mon, Sep 7 2015
  • Charts and Cycles of Time - Part 2

    As discussed in Part 1 the quirks of our time-keeping systems can lead to "interesting" patterns in time series charts. Here's the example I used previously for ease of reference:

    To summarize the significant bits of this chart in words: The daily totals are dependent on day of the week, with sales on Saturdays being higher than on weekdays and sales on Sundays much lower. This helps create a cyclical…

    • Tue, Aug 25 2015
  • Charts and Cycles of Time - Part 1

    Our ability to get by in everyday life relies on all sorts of estimates relating to time. We estimate the time it takes to get ready in the morning and set the alarm clock accordingly. When walking we make judgments about how long we think it will take to cover the distance to the other side of the road and how long we think it will take any approaching vehicles to reach the crossing point. And so on.

    Our estimates…

    • Mon, Aug 24 2015
  • An Introduction to Small Multiples

    In my last article I argued that there's still a place for GIFs in data visualization on the web. A GIF can be used to illustrate how a measure or measures have changed over time or vary based on a third, categorical, variable. Small multiples — collections of small (obviously) graphics where the same variables are plotted in each graphic but the data in each graphic are conditioned based on another variable (or two)…

    • Tue, Aug 11 2015