March 20th update: you can download a demo of this control now and Denis has also posted an article on this development.
How do you picture a 5-dimensional sphere, or see the correlations in 112-dimensional weather data? The answer dates back to 1885, when parallel coordinates were invented; the technique was later refined by Inselberg in 1959. The idea is actually very simple: given an N-dimensional data space, you draw N parallel axes, each with its own scale. To plot a data point, mark each of its N coordinates on the corresponding axis and connect the marks with a line. In this way you get one line per data point, and a series of data points is mapped to a series of lines on the graph. The situation is much like projective geometry, where we also find a back-and-forth thinking between lines and points in space.
The good thing about such a representation is that if points accumulate in a particular dimension you will see it, because the lines accumulate on the axis representing this dimension (see adjacent image). This kind of correlation can usually only be detected through statistical analysis, but what do you do if you want to record or supervise data which continuously comes in from real-time systems? This would mean a continuous mathematical analysis of the sampled data, and that would entail a very custom algorithm bound to the particular context.
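To make the point-to-line mapping concrete, here is a minimal sketch in Python. It is not the code of the WPF control; the function names (`axis_ranges`, `to_polyline`) and the choice of min-max scaling per axis are assumptions made for illustration only.

```python
def axis_ranges(points):
    """Per-dimension (min, max), so that every axis gets its own scale."""
    dims = len(points[0])
    return [(min(p[d] for p in points), max(p[d] for p in points))
            for d in range(dims)]


def to_polyline(point, ranges, width=1.0, height=1.0):
    """Map one N-dimensional point to N (x, y) vertices, one per parallel axis."""
    n = len(point)
    vertices = []
    for d, value in enumerate(point):
        lo, hi = ranges[d]
        # Position along the axis, normalized to [0, 1].
        # A dimension with a single value is drawn mid-axis (an assumption).
        t = 0.5 if hi == lo else (value - lo) / (hi - lo)
        # Axes are spread evenly from left to right.
        x = width * d / (n - 1) if n > 1 else 0.0
        vertices.append((x, height * t))
    return vertices


# Three made-up 3-dimensional points; the third dimension is constant.
points = [(1.0, 10.0, 100.0), (2.0, 30.0, 100.0), (3.0, 20.0, 100.0)]
ranges = axis_ranges(points)
polylines = [to_polyline(p, ranges) for p in points]
```

Each entry of `polylines` is the list of vertices for one data point; drawing a line through them gives one line per data point, as described above.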
A customer of ours recently challenged us to implement the parallel coordinates visualization in WPF. While quite straightforward at first sight, it proved to be more difficult than we initially thought, for two reasons:
- Non-numeric data: what do you do if one of the dimensions contains string data? Say you want one of the dimensions to show timestamps, telephone numbers or names of patients.
- How do you emphasize the accumulation points in the graph? While the screenshots above show the discrete tendency in the second dimension quite clearly, in the screenshot on the right it is a challenge to detect it.
The first problem was tackled through an internal mapping of the data which spreads the non-numeric data across an axis. The drawback is that at various levels one has to switch between the actual data and the internal representation: for labels, line hit-testing during hovering, the histogram (see below) and any calculation in general. The second problem was more difficult and entailed some analysis of the data and various visual alternatives to attract the attention of the user (the human interpreter of the data). The end result is a histogram which, like a rotated bar chart on the axis, graphs the relative number of line crossings on that axis. The histogram also shows the opposite of an accumulation: evenly sized histogram rectangles (see the second axis in the screenshot below) mean that the lines are evenly distributed across the dimension axis.
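The histogram idea can be sketched roughly as follows, in Python rather than WPF. This is an illustration and not the control's actual algorithm: the function name `crossing_histogram`, the fixed number of equal-height bins, and the assumption that crossing positions are already normalized to [0, 1] along the axis are all mine.

```python
def crossing_histogram(positions, bins=10):
    """Relative number of line crossings per bin along one axis.

    `positions` are the points where lines cross the axis,
    assumed to be normalized to the range [0, 1].
    """
    counts = [0] * bins
    for t in positions:
        # Clamp so that t == 1.0 (or a slightly out-of-range value)
        # still lands in a valid bin.
        index = max(0, min(int(t * bins), bins - 1))
        counts[index] += 1
    total = len(positions)
    return [c / total for c in counts] if total else [0.0] * bins


# Lines that accumulate near the bottom of the axis...
clustered = crossing_histogram([0.1, 0.12, 0.15, 0.9], bins=4)
# ...versus lines spread evenly across it.
even = crossing_histogram([0.0, 0.3, 0.6, 0.9], bins=4)
```

Here `clustered` has one dominant bar, which is the accumulation the user should notice, while `even` has bars of equal length, which corresponds to the equal distribution mentioned above.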
Other minor features we added to this data visualization are: hovering over the graph to see the actual data of the multi-dimensional points, datagrid binding and editing, customization of the look & feel through dependency properties, swapping data axes, and scaling data vertically on a particular axis.
Probably the most satisfying moment for me during the development was when it became clear that just about any CSV/spreadsheet data can be bound to the parallel graph visualization. Indeed, thanks to the internal mapping mechanism one can present any type of tabular data, and as a proof of concept I mocked up a random spreadsheet in Excel which was subsequently visualized without a hitch.
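As a rough illustration of how such an internal mapping lets arbitrary tabular data onto the axes, here is a Python sketch. The control's real mapping is not shown in this post, so everything here is an assumption: the helper names (`map_column`, `label_at`), the rule that distinct non-numeric values are sorted and spread evenly along the axis, and the made-up sample data.

```python
import csv
import io


def is_number(text):
    """True if the cell text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def map_column(values):
    """Return (positions in [0, 1], labels) for one column of cell texts.

    Numeric columns are min-max scaled and labels is None.
    Non-numeric columns get their distinct values spread evenly along
    the axis; labels lists those values in axis order.
    """
    if all(is_number(v) for v in values):
        nums = [float(v) for v in values]
        lo, hi = min(nums), max(nums)
        return [0.5 if hi == lo else (n - lo) / (hi - lo) for n in nums], None
    labels = sorted(set(values))
    step = 1.0 / (len(labels) - 1) if len(labels) > 1 else 0.0
    slot = {label: i * step for i, label in enumerate(labels)}
    return [slot[v] for v in values], labels


def label_at(position, labels):
    """Switch back from the internal position to the actual value,
    for example to show it while hovering over a line."""
    return labels[round(position * (len(labels) - 1))]


# Made-up sample standing in for a spreadsheet exported as CSV.
SAMPLE = "patient,age,ward\nAda,34,North\nBob,58,South\nCyd,46,North\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
columns = {name: map_column([row[name] for row in rows]) for name in rows[0]}
```

After this, every column is a list of axis positions regardless of its original type, and `label_at` shows the switch back from the internal representation to the actual data that labels and hovering need.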
Stocks & shares anyone? Let me know.






