
A D3.v3.js module for creating 2D representations of data


d3-twodim

d3-twodim is a D3.v3.js module for creating two-dimensional representations of data for end-user exploration. Currently, d3-twodim can instantiate scatterplots using SVG, Canvas, or WebGL, and will (in the future) instantiate visualization techniques such as Splatterplots and subsampled scatterplots. This module uses a factory design pattern to keep d3-twodim components linked to one another in order to interchange data, data item state, and user interaction.

This project is under active development. Please feel free to file an issue, open a pull request, or contact the author with any feedback. If you use d3-twodim in your own project, we would love to hear about it!

An example instantiation of d3-twodim

View a live example on the GitHub-hosted project page, or build the source and navigate to the examples/ folder in a webserver.

Installing

Download the latest release, and include d3-twodim as a script source after including d3.v3.js. If you use NPM, run npm install d3-twodim. Otherwise, you can modify and rebuild the library by calling npm install from the project root.

We recommend using a bash shell to install. On Windows, download git for windows, and make sure both zip and bzip2.dll are available in your path ([1]).

Example Instantiation

You can view an example instantiation within the repository by navigating to simpleExample.html after building the library.

d3-twodim uses the factory design pattern to keep track of all linked components. In your code, first create the factory by calling new d3_twodim.twoDimFactory(), then create objects using the factory's createComponent() method. To make your first scatterplot, you can simply do the following:

var twoDFactory = new d3_twodim.twoDimFactory();
var scatterplot = twoDFactory.createComponent({type: 'scatterplot'})
  .width(400).height(400);

// set the data
twoDFactory.setData([[1,1],[2,2],[3,3]]);
d3.select('body').append('svg')
  .attr('width', 500).attr('height', 500)
  .append('g')
    .attr('class', 'scatterplot')
    .attr('transform', 'translate(50,50)')
    .call(scatterplot, '.scatterplot');

The real power comes from linking components together -- for example, you could have one scatterplot looking at the first two dimensions of your data, and the next scatterplot looking at two other dimensions. When you brush over one scatterplot, the corresponding points in the other scatterplot also update.

var scatterplot = twoDFactory.createComponent({type: 'scatterplot'})
  .width(400).height(400)
  .doBrush(true)
  .fields(["dim1", "dim2"]);
  
var scatterplot2 = twoDFactory.createComponent({type: 'scatterplot'})
  .width(400).height(400)
  .doBrush(true)
  .fields(["dim3", "dim4"]);
  
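// create the shared SVG container (sized here to fit both plots side by side)
var svg = d3.select('body').append('svg')
  .attr('width', 950).attr('height', 500);

svg.append('g')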
  .attr('class', 'scatterplot')
  .attr('transform', 'translate(50,50)')
  .call(scatterplot, '.scatterplot');

svg.append('g')
  .attr('class', 'scatterplot2')
  .attr('transform', 'translate(500, 50)')
  .call(scatterplot2, '.scatterplot2');

There are several other options you can add to enhance the functionality and interaction between your d3-twodim components. The scatterplot component in particular exposes mouse{over,down,out} and click events to enable custom interaction, such as showing tooltips.

var scatterplot = twoDFactory.createComponent({type: 'scatterplot'})
  .width(400).height(400)
  .on('mouseover', function(d, ptPos) {
    tooltip.transition()
      .duration(200)
      .style('opacity', 0.9);
    tooltip.html(d.author + ": " + d.title)
      .style('left', ptPos.left + "px")
      .style('top', ptPos.top + "px");
  })
  .on('mouseout', function(d) {
    tooltip.transition()
      .style('opacity', 0);
  });

There are also legend, objectlist, and dropdown components to interact with the scatterplot. Example instantiation of these components can be seen in the simple example.

Scatterplot Components

Scatterplot-like designs can also be constructed by extending the scatterplot_component prototype. Given a set of data, such a component only needs to implement a draw(), update(), and optionally a highlight() method. See bins.js for an example implementation.

var bins = twoDFactory.createComponent({type: 'scatterplot', render: 'bins'})
  .width(width)
  .height(height)
  .fields([0,1])
  .circleSize(7);

For convenience, several WebGL shims are available for use in webgl_utils.js. You can see how these shims are used in the implemented Splatterplot component, splatterplot.js.

API Reference

The Factory

The twoDimFactory object ties all the d3-twodim components together. If you create all your components through this factory, it will handle passing data and triggering updates for you. As an example, a highlight call will tell all linked components to emphasize data points that match the given function.

# d3_twodim.twoDimFactory() <code>

Creates a d3-twodim factory, where all instantiated objects are linked with the same data and share a global d3.dispatch object.

# twoDimFactory.createComponent({type: "component_type"[, render: "{svg|canvas|webgl}"]}) <code>

Creates a d3-twodim component of the given type, and returns the object representing the requested component. The type field is required in the options anonymous object, and component_type must be one of the strings scatterplot, objectlist, dropdown, or legend.

If you are creating a scatterplot object, you may also add the render field, which must be one of the following strings: svg, canvas, webgl, or splatterplot. Scatterplots default to svg rendering.

# twoDimFactory.setData(data) <code>

Sets the data for all components instantiated by this factory. Expects data in an array format, where every element conforms to a standardized, consistent anonymous object format or array of consistent sizes. See D3's data() documentation for more detail.
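As a minimal sketch of the two data shapes described above, the snippet below checks that a dataset is either an array of equal-length arrays or an array of objects with the same field names. The helper hasConsistentShape is hypothetical and not part of d3-twodim; it only illustrates one reading of "consistent" before data is handed to setData().

```javascript
// Hypothetical helper (not part of d3-twodim): returns true when every
// element of `data` has the same shape as the first element.
function hasConsistentShape(data) {
  if (!Array.isArray(data) || data.length === 0) return false;

  // Describe an item by its length (arrays) or its sorted field names (objects).
  function signature(item) {
    if (Array.isArray(item)) return 'array:' + item.length;
    if (item !== null && typeof item === 'object') {
      return 'object:' + Object.keys(item).sort().join(',');
    }
    return 'other';
  }

  var first = signature(data[0]);
  if (first === 'other') return false;
  return data.every(function(item) { return signature(item) === first; });
}

var asArrays = [[1, 1], [2, 2], [3, 3]];
var asObjects = [{dim1: 1, dim2: 4}, {dim1: 2, dim2: 5}];

// Assumed usage: only pass data to the factory once the check succeeds.
// if (hasConsistentShape(asObjects)) { twoDFactory.setData(asObjects); }
```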

# twoDimFactory.setGroupColumn(selector) <code>

Sets the function that determines the group name of a given point. The given function selector takes an arbitrary data point, and returns a string representation of its group membership. This function is shared with any instantiated scatterplot and legend components.
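A selector of this kind might look like the sketch below. The genre field and the fallback group name 'unknown' are assumptions made for illustration; any function that maps a data point to a string should fit the description above.

```javascript
// Hypothetical selector: maps a data point to the name of its group.
// The `genre` field is an assumption about the shape of the data.
function genreSelector(d) {
  return d.genre === undefined || d.genre === null ? 'unknown' : String(d.genre);
}

var books = [
  {title: 'A', genre: 'poetry'},
  {title: 'B', genre: 'drama'},
  {title: 'C'}
];
var groupNames = books.map(genreSelector);

// Assumed usage with an existing factory:
// twoDFactory.setGroupColumn(genreSelector);
```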

# twoDimFactory.setGroupField(groupField[, numBins]) <code>

Sets the categorical column name that will be used to group points. Shorthand for calling setGroupColumn. The given string groupField is converted to a function and is shared with any instantiated scatterplot and legend components. If groupField is continuous, consider passing numBins to discretize the field into that number of equally-sized bins.
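To make the idea of discretizing a continuous field concrete, here is a rough sketch that builds a group selector from equal-width bins. This is not the library's implementation: makeBinSelector, the label format, and the choice of equal-width (rather than equal-count) bins are all assumptions.

```javascript
// Hypothetical sketch (not the library's implementation): build a selector
// that assigns each point to one of `numBins` equal-width bins of `field`.
function makeBinSelector(data, field, numBins) {
  var values = data.map(function(d) { return +d[field]; });
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  var width = (max - min) / numBins || 1; // avoid a zero width when min === max

  return function(d) {
    // Clamp so that the maximum value falls into the last bin.
    var idx = Math.min(numBins - 1, Math.floor((+d[field] - min) / width));
    var lo = min + idx * width;
    return lo + ' to ' + (lo + width);
  };
}

var points = [{score: 0}, {score: 4}, {score: 5}, {score: 10}];
var binOf = makeBinSelector(points, 'score', 2);
var labels = points.map(binOf);
```

The returned function has the shape that setGroupColumn() expects: it takes a data point and returns a string.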

# twoDimFactory.highlight(highlightFunction) <code>

Programmatically kicks off a highlight dispatch to all components instantiated from this factory. Objects matched by the given highlightFunction (which works much like a function given to filter()) have their 'highlighted' behavior enabled, and a redraw is triggered on all linked d3-twodim components.
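A highlight function can be sketched as an ordinary predicate over data items. The year field and the sample data below are assumptions; the preview with Array.prototype.filter only shows which items such a predicate would match.

```javascript
// A highlight function is a predicate over data items, in the same spirit as
// the callback given to Array.prototype.filter. The `year` field is assumed.
function publishedAfter1900(d) {
  return d.year > 1900;
}

var items = [
  {title: 'A', year: 1850},
  {title: 'B', year: 1925},
  {title: 'C', year: 2001}
];

// Preview which items the predicate would match.
var matched = items.filter(publishedAfter1900).map(function(d) { return d.title; });

// Assumed usage with an existing factory:
// twoDFactory.highlight(publishedAfter1900);
```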

Scatterplot

The scatterplot object is king, queen, bishop, and rook. Although only a small suite of options can currently be changed, the scatterplot should be configured to best support the task at hand. Because of the browser performance limitations of SVG, WebGL rendering can be beneficial when there are many points (say, over 20k). In the future, this object will support other scatterplot-like transformations, such as binning, subsampling, density estimation, etc. If you have ideas, please file an issue.

WebGL support is under development. We are considering providing scatterplot components that share the same set of WebGL utilities, so that other WebGL scatterplot implementations can be added with minimal overhead. Currently, splatterplot is a valid render type, which will activate the splatterplot.js component. Feel free to add your implementation as a component through a pull request.

Limitations (v0.5): Currently, only SVG is fully featured. Canvas and WebGL rendering modes lack interaction support, including event handling and highlight dispatches.

# d3_twodim.scatterplot(dispatch) <code>

Constructs a representation of a scatterplot, attached to the given dispatch object to receive highlight and groupUpdate dispatches. Like other D3 reusable components, all functions return this object for method chaining if at least one argument is supplied.

Should only be called indirectly, through the createComponent method of d3_twodim.twoDimFactory, e.g.

var factory = new d3_twodim.twoDimFactory();
var scatter = factory.createComponent({type: "scatterplot", render: "webgl"});

# scatterplot(selection, name) <code>

Kicks off a render of the scatterplot object on the given selection. Following D3.js convention, this should be executed on a selection, such as:

d3.select("g.scatterplot").call(scatterObj, ".scatterplot");

where the first argument is the selection (e.g., g.scatterplot) and the second argument is the string (e.g., ".scatterplot"; necessary to namespace dispatched events). Currently, this is the only way to force a re-render of the scatterplot, so if data is changed via factory.setData() or any appearance attribute is changed after the first render, this method must be called again.

For svg rendering, the selection is expected to be a <g> SVG element, created before calling this object.

For canvas and webgl elements, the selection is expected to be some sort of block container, such as <div>. Any existing SVG or canvas elements within this container will be removed. To support interaction, initializeCanvasLayers() is called to construct and align an SVG component directly on top of a canvas element.

Mouse events mouseover, mousedown, mouseout, and click for drawn points are exposed on the scatterplot selection. You may bind listeners to these events. Each listener is called with this set to the interacted point and with the data object bound to that point as its first argument d. For mouseover, a second argument ptPos describes the location of the point within the scatterplot (helpful for tooltips or prompting an interface change).

An example of binding a listener to these events:

twoDFactory.createComponent({type: 'scatterplot'})
  .width(400)
  .height(400)
  .fields(['left', 'right'])
  .on('mouseover', function(d, ptPos) {
    tooltip.show().position({left: ptPos[0], top: ptPos[1]}).data(d);
  })
  .on('mouseout', function() {
    tooltip.hide();
  });

Currently, an SVG clip mask hides points that fall outside of the chart area (this happens by default on Canvas and WebGL).

To highlight points on the scatterplot, pass a filter selection function to the twoDimFactory.highlight() function. Those points that are not selected by this function will have the CSS class "point-hidden" applied to them (which should be styled by the user). This class name can be changed by the scatterplot method hiddenClass().

# scatterplot.data([newData[, key]]) <code>

Gets or sets the data bound to points in the scatterplot. Following D3.js convention, newData should be an array of anonymous objects. Generally set all at once by the twoDimFactory.setData() method.

The key function is passed to D3's selection.data(). If no key function is given, each data element is given an additional field orig_index, describing the original order of data items in the given dataset, and the key function then uses this field. Regardless of the state of the key argument, when all filters and highlights are removed from the data, the orig_index field is used to preserve the initial drawing order of the points.

If fields have not been defined yet, and the given data items are not arrays, automatically selects the first two fields it finds as being the x- and y-dimensions (regardless of whether those fields contain continuous data). To change this behavior, call scatterplot.fields().
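The fallback keying described above can be pictured roughly as follows. This is only a sketch under stated assumptions, not the library's code: tagWithOriginalIndex and origIndexKey are hypothetical names, and the sort at the end simply shows how an orig_index field can restore the initial order.

```javascript
// Hypothetical sketch of the fallback described above (not the library's code):
// tag each item with its original position and key on that field.
function tagWithOriginalIndex(data) {
  data.forEach(function(d, i) { d.orig_index = i; });
  return data;
}

function origIndexKey(d) { return d.orig_index; }

var rows = tagWithOriginalIndex([{name: 'a'}, {name: 'b'}, {name: 'c'}]);

// After filtering or reordering, sorting by orig_index restores the
// initial drawing order.
var shuffled = [rows[2], rows[0], rows[1]];
var restored = shuffled.slice().sort(function(a, b) {
  return origIndexKey(a) - origIndexKey(b);
});
var restoredNames = restored.map(function(d) { return d.name; });
```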

# scatterplot.renderType([renderType]) <code>

Gets or sets the type of rendering mechanism. renderType should be one of the strings "svg", "canvas", or "webgl". Usually set on instantiation with the factory object; see twoDimFactory.createComponent.

Subsequent calls of scatterplot on a selection will populate the selection with the given rendering type. When changing to or from the "svg" render type, the selection should change as well; see scatterplot for more details.

# scatterplot.width([width]) <code>

Gets or sets the width of the constructed scatterplot, defaults to 1. The caller is responsible for maintaining sensible margins, meaning that this width defines the drawable graph area of the scatterplot, and not necessarily the graph amenities such as axis labels and ticks. A generally safe margin is 30 to 50 pixels.

# scatterplot.height([height]) <code>

Gets or sets the height of the constructed scatterplot, defaults to 1. The caller is responsible for maintaining sensible margins, see scatterplot.width.
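Since the caller owns the margins, it can help to derive the outer SVG size from the drawable area. The helper below is a hypothetical sketch (outerLayout is not part of d3-twodim) that assumes the same margin on all four sides, as in the 400 by 400 plot inside a 500 by 500 SVG shown earlier.

```javascript
// Hypothetical helper: compute the outer <svg> size and the translate()
// offset for a plot whose drawable area is `width` x `height`.
function outerLayout(width, height, margin) {
  return {
    svgWidth: width + 2 * margin,
    svgHeight: height + 2 * margin,
    transform: 'translate(' + margin + ',' + margin + ')'
  };
}

var layout = outerLayout(400, 400, 50);

// Assumed usage:
// d3.select('body').append('svg')
//   .attr('width', layout.svgWidth).attr('height', layout.svgHeight)
//   .append('g').attr('transform', layout.transform);
```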

# scatterplot.x([xSelector]) <code>

Gets or sets the accessor to determine the continuous x-dimension value from each item in the dataset. Default value pulls the first index from a data item if each data item is an array, or the first object field otherwise (see scatterplot.data).

# scatterplot.y([ySelector]) <code>

Gets or sets the accessor to determine the continuous y-dimension value from each item in the dataset. Default value pulls the second index from a data item if each data item is an array, or the second object field otherwise (see scatterplot.data).
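The default behavior of the x and y accessors can be approximated with the sketch below. It is an illustration of the documented defaults rather than the library's code; defaultAccessor is a hypothetical name, and "first object field" is taken to mean the first key returned by Object.keys.

```javascript
// Hypothetical sketch of the documented default (not the library's code):
// read position `i` from an array item, or the i-th field of an object item.
function defaultAccessor(i) {
  return function(d) {
    if (Array.isArray(d)) return d[i];
    return d[Object.keys(d)[i]];
  };
}

var xValue = defaultAccessor(0);
var yValue = defaultAccessor(1);

var fromArray = [xValue([3, 7]), yValue([3, 7])];
var fromObject = [xValue({height: 1.8, weight: 72}), yValue({height: 1.8, weight: 72})];

// A custom accessor can be any function of the data item, e.g.:
// scatterplot.x(function(d) { return Math.log(d.weight); });
```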

# scatterplot.xLabel([xName]) <code>

Gets or sets the x-axis label for the scatterplot. Can be restyled by selecting ".xaxis .alabel" after rendering the scatterplot.

# scatterplot.yLabel([yName]) <code>

Gets or sets the y-axis label for the scatterplot. Can be restyled by selecting ".yaxis .alabel" after rendering the scatterplot.

# scatterplot.labels([[xName, yName]]) <code>

Gets or sets the x- and y-axis labels for the scatterplot concurrently. A given argument must be an array of strings of length 2.

# scatterplot.xField([xField]) <code>

If data items are anonymous objects, gets or sets the field name from which to extract the x-dimension. This function updates both the xValue function, accessible from scatterplot.x, and the label name for the x-axis, see scatterplot.xLabel.

# scatterplot.yField([yField]) <code>

If data items are anonymous objects, gets or sets the field name from which to extract the y-dimension. This function updates both the yValue function, accessible from scatterplot.y, and the label name for the y-axis, see scatterplot.yLabel.

# scatterplot.fields([[xField, yField]]) <code>

Gets or sets the x- and y-field values concurrently. A given argument must be an array of strings of length 2.

# scatterplot.circleSize([circleSize]) <code>

Gets or sets the size of the circle that represents objects in the scatterplot.

# scatterplot.changeDuration([duration]) <code>

Gets or sets the duration of animated transitions (in milliseconds) when updating the scatterplot bounds, axes, or point locations.

# scatterplot.pointIdentifier([newIDFunc]) <code>

Gets or sets the key function for the scatterplot data, see scatterplot.data.

# scatterplot.groupColumn([grpVal]) <code>

Gets or sets the function to extract the group membership for each data element. By default, this function is null, implying that all points are members of one data series.

# scatterplot.colorScale([colorScale]) <code>

Gets or sets the d3.scale object that maps the groupColumn to a color. An ordinal scale (such as d3.scale.category20b) can be used for categorical data, while a quantize scale should be used for group values that are continuous.

# scatterplot.bounds([newBounds]) <code>

Gets or sets the bounds of the scatterplot. The bounds are given as a 2D array, of the format [[xmin, xmax], [ymin, ymax]]. The scatterplot needs to then be called on the selection in order to prompt a render to show the updated bounds.
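When automatic rescaling is not wanted, bounds in this format can be computed by hand. The helper computeBounds below is hypothetical and assumes accessors of the same kind that scatterplot.x() and scatterplot.y() accept.

```javascript
// Hypothetical helper: compute [[xmin, xmax], [ymin, ymax]] from a dataset
// using the same kind of accessors that scatterplot.x() and scatterplot.y() take.
function computeBounds(data, xValue, yValue) {
  var xs = data.map(function(d) { return xValue(d); });
  var ys = data.map(function(d) { return yValue(d); });
  return [
    [Math.min.apply(null, xs), Math.max.apply(null, xs)],
    [Math.min.apply(null, ys), Math.max.apply(null, ys)]
  ];
}

var pts = [[1, 10], [4, -2], [3, 6]];
var bounds = computeBounds(pts,
  function(d) { return d[0]; },
  function(d) { return d[1]; });

// Assumed usage: set the bounds, then call the scatterplot on its selection
// again so that the change is rendered.
// scatterplot.bounds(bounds);
// d3.select('g.scatterplot').call(scatterplot, '.scatterplot');
```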

# scatterplot.doBrush([doBrush]) <code>

Gets or sets whether the scatterplot should implement a rectangular brushing component, similar to d3's brush. By changing this value, the component is added to or removed from the scatterplot. Note that activating this component with the doZoom option is not supported (the function of the mouse is overloaded).

# scatterplot.doVoronoi([doVoronoi]) <code>

Gets or sets whether the scatterplot should generate a voronoi overlay for an easier end-user experience of interacting with points. By changing this value, the component is added to or removed from the scatterplot.

Generated voronoi cells are linked to the points they represent, and mouse events bound to the scatterplot are rebound to these generated voronoi cells. By default, a voronoi overlay is created for all points. To only activate the voronoi overlay for highlighted points (e.g., for performance reasons), pass true to squashMouseEvents().

# scatterplot.doZoom([doZoom]) <code>

Gets or sets whether the scatterplot should support panning and zooming with the chart area. By changing this value, this functionality is added to or removed from the scatterplot. Note that activating this component with the doBrush option is not supported (the function of the mouse is overloaded).

# scatterplot.squashMouseEvents([doLimitMouse]) <code>

Gets or sets whether mouse events should be fired when no points are highlighted. With the default value of false, all point-based mouse events will be fired. When set to true, this disables voronoi generation and firing mouse events when no points are highlighted, which can result in great redraw performance savings.

# scatterplot.supressHiddenPoints([newSupress]) <code>

Gets or sets whether the scatterplot suppresses mouse events for hidden (non-highlighted) points. By default, this is true, meaning that mouse events will not be fired for those points and associated voronois if those points are hidden by a highlight event. By changing this value to false, all points can be targeted by mouse events, regardless of the point state.

# scatterplot.hiddenClass([newClass]) <code>

Gets or sets the CSS class that is set when points are hidden (applied to those points that are not explicitly highlighted). This can help avoid CSS namespace collisions if the default class point-hidden is taken by an external CSS dependency.

# scatterplot.autoUpdateBounds([updateBounds]) <code>

Gets or sets the boolean value that determines whether the scatterplot should automatically rescale its bounds whenever the data accessors change (i.e., the x-value, y-value, or group value changes). The default is true; the scatterplot will rescale when new accessors are chosen and a redraw triggered. If set to false, the user is responsible for calling the bounds() function and triggering a redraw to rescale the scatterplot.

Legend

The legend component provides the color key for the scatterplot. The legend should be updated through the factory using the twoDimFactory.setGroupColumn method, which will update all linked scatterplots and legends with the grouping function and color scale to maintain consistency.

# d3_twodim.legend(dispatch) <code>

Constructs a representation of a legend, attached to the given dispatch object to receive highlight and groupUpdate dispatches. Like other D3 reusable components, all functions return this object for method chaining if at least one argument is supplied.

Should only be called indirectly by using d3_twodim.twoDimFactory, e.g.

var factory = new d3_twodim.twoDimFactory();
var legend = factory.createComponent({type: "legend"});

# legend(selection, name) <code>

Kicks off a render of the legend object on the given selection. Following D3.js convention, this should be executed on a selection, such as:

d3.select("g.legend").call(legendObj, ".legend");

where the first argument is the selection (e.g., g.legend) and the second argument is the string (e.g., ".legend"; necessary to namespace dispatched events). Currently, only rendering to an SVG element is supported.

The legend exposes the click event, fired whenever a color or group is clicked in the legend. The callback receives one argument, d, which contains the fields name (a string representing the group) and active (a boolean representing whether this group is actively drawn).
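A click listener might keep track of the visible groups as in the sketch below. It assumes the callback argument carries fields called name and active, and that the legend accepts listeners through on() in the same way the scatterplot and dropdown examples do; the simulated events stand in for real clicks.

```javascript
// Tracks which groups are currently drawn, based on legend click events.
// Assumes the callback argument has the shape {name: string, active: boolean}.
var activeGroups = {};

function onLegendClick(d) {
  activeGroups[d.name] = d.active;
}

// Simulated events, standing in for real clicks on the legend.
onLegendClick({name: 'poetry', active: false});
onLegendClick({name: 'drama', active: true});

var drawn = Object.keys(activeGroups).filter(function(name) {
  return activeGroups[name];
});

// Assumed usage with a legend component:
// legend.on('click', onLegendClick);
```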

By default, clicking on a group in the legend will toggle its visibility in all linked scatterplot components. This behavior cannot currently be overridden.

# legend.data([newData[, key]]) <code>

Gets or sets the data bound to points displayed in the scatterplot. Following D3.js convention, newData should be an array of anonymous objects or an array of arrays. Generally set all at once by the twoDimFactory.setData() method.

# legend.groups([*newGroups, key]) <code>

Gets or sets the groups and color scale to display in the legend. On a twoDimFactory.groupUpdate dispatch (generally from scatterplot during a render), this function updates the data contained in the legend and prompts a redraw.

# legend.groupColumn([grpVal]) <code>

Gets or sets the function to select the group membership from an item from the dataset. Generally set from the factory object by calling twoDimFactory.setGroupColumn, which will handle updating all linked scatterplot and legend components to show consistent group mappings to colors.

Dropdown

Dropdowns provide an easy interaction mechanism to update the selected points and which dimensions are displayed in the scatterplot. In practical use, dropdowns either kick off highlight triggers to other elements by calling twoDimFactory.highlight with the selected data, or trigger scatterplot renders by changing rendering options.

# d3_twodim.dropdown(dispatch) <code>

Constructs a representation of a dropdown, attached to the given dispatch object to receive highlight dispatches. Like other D3 reusable components, all functions return this object for method chaining if at least one argument is supplied.

Should only be called indirectly by using d3_twodim.twoDimFactory, e.g.

var factory = new d3_twodim.twoDimFactory();
var dropdown = factory.createComponent({type: "dropdown"});

# dropdown(selection, name) <code>

Kicks off a render of the dropdown object on the given selection. Following D3.js convention, this should be executed on a selection, such as:

d3.select("div#dropdown").call(dropdownObj, "#dropdown");

where the first argument is the selection container (e.g., div#dropdown) and the second argument is the string (e.g., "#dropdown"; necessary to namespace dispatched events). Currently, only rendering to a block display DOM element is supported. Only DOM elements matching <select> will be destroyed or modified through this call.

The mapFunction determines whether this dropdown will be a select or multi-select. If this option has been changed between rendering calls, the <select> element will be destroyed and re-created.

The dropdown exposes the change event, fired whenever a different item is selected in the dropdown. The callback receives one argument, d, which is exactly the string and value shown in the dropdown <option>. To capture the options selected in the dropdown, use something such as:

var factory = new d3_twodim.twoDimFactory();
var dropdown = factory.createComponent({type: "dropdown"})
  .mapFunction("dims")
  .on("change", function() {
    var selected = d3.select(this).selectAll('option')
      .filter(function(d) { return this.selected; });
  });

# dropdown.data([data[, key]]) <code>

Gets or sets the data bound to points displayed in the scatterplot. Following D3.js convention, data should be an array of anonymous objects or an array of arrays. Generally set all at once by the twoDimFactory.setData() method.

# dropdown.mapFunction(func|"dims"|"values", columnName) <code>

Gets or sets the method of generating the values to fill the dropdown. There are three ways to pass data to this function:

  • dropdown.mapFunction(func): Use func to generate relevant values, where func accepts the entire dataset as an argument.
  • dropdown.mapFunction("dims"): Generates a function to obtain the dimension names from the data (assuming that a data item is an anonymous object and the field names are all addressable dimensions). Sets the dropdown to be a single-select (classic dropdown).
  • dropdown.mapFunction("values", columnName): Generates a function to pull all unique values from the given columnName. Sets the dropdown to be a multi-select (users can Opt- or Ctrl-click multiple entries).
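For the first form, a custom func could look like the sketch below. The function uniqueAuthors and the author field are hypothetical, and returning an array of strings is an assumption about what the dropdown expects from func.

```javascript
// Hypothetical custom map function: receives the entire dataset and returns
// the sorted unique values of the (assumed) `author` field.
function uniqueAuthors(data) {
  var seen = {};
  data.forEach(function(d) { seen[d.author] = true; });
  return Object.keys(seen).sort();
}

var library = [
  {author: 'Shelley', title: 'A'},
  {author: 'Austen', title: 'B'},
  {author: 'Shelley', title: 'C'}
];
var options = uniqueAuthors(library);

// Assumed usage with a dropdown component:
// dropdown.mapFunction(uniqueAuthors);
```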

Object List

Provides a simple list that updates whenever highlight is triggered on it. In the future, this may aggregate similar objects together to form a concise list.

# d3_twodim.objectlist(dispatch) <code>

Constructs a representation of an object list, attached to the given dispatch object to receive highlight dispatches. Like other D3 reusable components, all functions return this object for method chaining if at least one argument is supplied.

Should only be called indirectly by using d3_twodim.twoDimFactory, e.g.

var factory = new d3_twodim.twoDimFactory();
var objectlist = factory.createComponent({type: "objectlist"});

# objectlist(selection, name) <code>

Kicks off a render of the objectlist object on the given selection. Following D3.js convention, this should be executed on a selection, such as:

d3.select("div.objectlist").call(objectListObj, ".objectlist");

where the first argument is the selection container (e.g., div.objectlist) and the second argument is the string (e.g., ".objectlist"; necessary to namespace dispatched events).

# objectlist.data([data[, key]]) <code>

Gets or sets the data bound to points displayed in the scatterplot. Following D3.js convention, data should be an array of anonymous objects or an array of arrays. Generally set all at once by the twoDimFactory.setData() method.

# objectlist.filter(filterFunc) <code>

Gets or sets the filter function that determines which data items should be displayed in the list. By default, filterFunc rejects all items in the dataset. Global highlight dispatches from twoDimFactory.highlight call this method (along with all linked components) with the supplied function.

# objectlist.pointToString(ptStringFunc) <code>

Gets or sets the function that transforms matched points into a string representation. This function should take a data item and return a string-only representation of that object.
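Such a function can be as simple as the sketch below. The name bookToString and the author, title, and year fields are assumptions about the data, chosen only to illustrate the string-only return value.

```javascript
// Hypothetical formatter: turns a data item into a plain string.
// The `author`, `title`, and `year` fields are assumptions about the data.
function bookToString(d) {
  return d.author + ': ' + d.title + ' (' + d.year + ')';
}

var label = bookToString({author: 'Austen', title: 'Emma', year: 1815});

// Assumed usage with an object list component:
// objectlist.pointToString(bookToString);
```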

To-do list

  • Add ability to lasso points
  • Add ability to programmatically select points
  • Add ability to view categorical data (see #4)
  • Support missing data (can make internal functions error; see #9)
  • Allow user to see statistics about selected points (in relation to background)
  • Allow interaction with drop-downs to select relevant dimensions for the user, or search for particular text of a point
  • Add pairwise correlation matrix component (shows level of correlation between two features)
  • Add Splatterplot component (add-on to WebGL rendering type)
  • Add subsampled graph option (add-on to SVG/Canvas rendering type?)
  • Add binning component (add-on to SVG/Canvas rendering type)
  • Add labeling options (for outliers?)