Looking for library/tool to visualise multi-dimensional data [closed]
B

8

11

I'm using Python in an attempt to analyse a large chunk of empirical measurements. In essence, I have two functions that transform the empirical data; each also takes 3 'count' parameters and returns a sequence of floats for each configuration. I'm expecting (hoping) to see some interesting patterns emerge when appropriate parameters are selected. I anticipate that the patterns might be relative between the sequences returned by each function - and/or relate to patterns of some kind in the parameters. In case it's relevant, the 3 'count' parameters roughly correspond to:

  • A 'window size' on the underlying data over which summary statistics are calculated
  • A number of consecutive windows used to compute a single summary statistic (i.e. the trade-off between greater spatial or greater temporal accuracy)
  • A 'minimum age' - an offset into the history of the underlying data.

The summary statistics (which generate the resulting sequences of floats for each parameter configuration) are non-trivial but will be independently sensitive to all three parameters.

I'm interested in visualisation techniques - suited to RAD/ad-hoc enquiry that will help me experiment with this multi-dimensional data.

So far, I've tinkered with MatPlotLib, but I find that being restricted to generating two graphs of 2 or 3 dimensions in the style of batch processing makes investigation very tedious. Ideally, I'd find a tool that would allow me to visualise more than two dimensions... perhaps allowing me to switch between dimensions in real time in an interactive GUI.

I'd really appreciate hints from any visualisation gurus as to suitable tools I should investigate - ideally to integrate with my existing Python functions - or in other languages. I'd especially like to hear any anecdotes of success with similar visualisation problems.

EDIT to add: One possible approach I'm considering is to use animation on 2 or 3D plots (to capture another dimension... leaving 1 or 2 for manual selection)... though I've found no good tools to help me achieve this, yet.

Bilodeau answered 21/1, 2012 at 15:33 Comment(3)
Matplotlib is great and does support 3D plots.Compliancy
orange.biolab.si mydatamine.com/?p=1100Immanent
The orange.biolab.si stuff looks neat - but I can't see any way to use it to visualise this sort of multi-dimensional data.Bilodeau
K
12

RGL is a visualization device system for R, using OpenGL as the rendering backend. An rgl device at its core is a real-time 3D engine written in C++. It provides an interactive viewpoint navigation facility (mouse + wheel support) and an R programming interface.

RGL Screenshot

GGobi is an open source visualization program for exploring high-dimensional data. It provides highly dynamic and interactive graphics such as tours, as well as familiar graphics such as the scatterplot, barchart and parallel coordinates plots. Plots are interactive and linked with brushing and identification.

GGobi Screenshot

There's a tutorial that covers both of the above systems here.

RPy is a very simple, yet robust, Python interface to the R Programming Language. It can manage all kinds of R objects and can execute arbitrary R functions (including the graphic functions). All errors from the R language are converted to Python exceptions. Any module installed for the R system can be used from within Python.

Khedive answered 25/1, 2012 at 9:57 Comment(4)
Many thanks for those two recommendations - both warrant my more detailed examination. I've heard about RPy, but never used it, so that recommendation is helpful, too. Are you sure you intended that link for the tutorial... it seems more like an advert to me? The only snag with both is that I can see no examples of the use of the tool with data of a similar structure to mine... i.e. 3 dimensions mapped to sampled continuous functions. If I've overlooked such an example, I'd appreciate an explicit pointer. :)Bilodeau
I'm afraid I haven't used either package otherwise I'd provide a more specific example, but by definition at least GGobi sounds exactly like the kind of thing you're after - "visualization program for exploring high-dimensional data". You're right about the 'tutorial' I posted - it's really a series of workshops, but they supply all their course material free online which you may find useful.Khedive
I didn't find much of interest in the course material... though I have found a reference to a book: "Interactive and Dynamic Graphics for Data Analysis With Examples Using R and GGobi" tinyurl.com/7qcf6g2 - which I now intend to read. I'm not sure it's the best answer to this problem - but it's definitely relevant in this field. To be honest, I thought it would be far easier to find example visualisations of a form similar to my problem than it has proven.Bilodeau
Animation with Matplotlib is relatively trivial, if all you were after is an animated 3d plot - matplotlib.sourceforge.net/examples/animation/…Khedive
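A minimal sketch of the animated 3D plot mentioned in the last comment, using `matplotlib.animation.FuncAnimation`. The function `surface_at` is a hypothetical stand-in for the real summary statistic; the frame index `t` plays the role of one extra parameter.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation


def surface_at(t, x, y):
    """Hypothetical placeholder: z over an (x, y) grid for animation step t."""
    return np.sin(x + 0.3 * t) * np.cos(y)


x, y = np.meshgrid(np.linspace(-3, 3, 40), np.linspace(-3, 3, 40))

fig = plt.figure()
ax = fig.add_subplot(projection="3d")


def draw_frame(t):
    # Redrawing the whole surface each frame is simple, though not the fastest.
    ax.clear()
    ax.set_zlim(-1, 1)
    ax.set_title(f"frame {t}")
    surf = ax.plot_surface(x, y, surface_at(t, x, y), cmap="viridis")
    return (surf,)


# Keep a reference to the animation object, otherwise it is garbage collected.
anim = FuncAnimation(fig, draw_frame, frames=20, interval=100)
```

Calling `plt.show()` afterwards displays the animation in an interactive window; `anim.save("surface.gif", writer="pillow")` should write it to a file instead, assuming Pillow is installed.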
W
3

You might want to look at outputting SVG with animation, in which case this question might interest you. I suspect the animation aspects will require a lot of work on your part. Another option is maybe visualizing the data as a graph, although I don't know enough about your data to know whether this would be useful to you. If it is, cytoscape is Python scriptable.

Windpipe answered 23/1, 2012 at 16:6 Comment(1)
It's the 'lot of work' on my part that I'm trying to avoid. I wouldn't mind putting in that effort if I were convinced that the patterns I'm looking for are present... This is a one-shot deal, for now, and I'll be the only user. I don't know a lot about the structure of the data I plan to visualise. I expect line graphs for each individual sequence (for 3 specific parameters) to approximate continuous functions...and for the surfaces generated by plotting that against any one of the 3 parameters to be relevant.Bilodeau
W
3

If all you want is an animated surface, then gnuplot can do it. A quick intro on it can be found here, or from the gnuplot FAQ. More detail can obviously be found in the gnuplot docs.
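A rough sketch of driving gnuplot from Python with `subprocess`, assuming a gnuplot build with the `gif` terminal is on the PATH. The surface expression, the parameter name `a0` and the output file name are placeholders for illustration, not part of the original answer.

```python
import shutil
import subprocess


def gnuplot_script(a0, frames=20, outfile="surface.gif"):
    """Build a gnuplot script that animates a surface for a chosen a0."""
    lines = [
        "set terminal gif animate delay 10",
        f"set output '{outfile}'",
        "set zrange [-1:1]",
        f"a0 = {a0}",
        # Placeholder surface; real data would normally be plotted from a file.
        f"do for [t=0:{frames - 1}] {{ splot sin(a0*x + 0.3*t)*cos(y) }}",
        "unset output",
    ]
    return "\n".join(lines) + "\n"


def render(a0, outfile="surface.gif"):
    """Run gnuplot on the generated script; returns None if gnuplot is missing."""
    if shutil.which("gnuplot") is None:
        return None
    script = gnuplot_script(a0, outfile=outfile)
    subprocess.run(["gnuplot"], input=script, text=True, check=True)
    return outfile
```

Calling `render(2.0)` regenerates the GIF for a new parameter value, so a small loop that reads user input could re-render on demand. This is still batch rendering per change rather than a truly interactive plot.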

Windpipe answered 23/1, 2012 at 16:55 Comment(6)
I'd love to use GNUPlot if I could... and the animated GIF support definitely looks nifty. What I'm unclear about is how I'd use this interactively... as far as I can tell, the animated GIF is produced as a batch process - whereas I'd want to be able to alter the parameters real-time... since, even with animated surfaces, this only deals with 3 of my 4 abstract dimensions.Bilodeau
Hang on, doesn't an animated surface show 4 dimensions (x,y,z,t).Windpipe
I was thinking you could generate gnuplot animations in response to user input and then display them by using subprocess to control gnuplot. Not a fantastic solution, but possibleWindpipe
My data can be represented as a set of tuples (a0,a1,a2,a3,v) - 4 abstract dimensions (all measured in units of time identified by a regular global clock) and one value - a scalar real value approximated by a float. This means, no matter what 3D approach I use (even animated) I have insufficient dimensions to directly map the tuple. I need, therefore, a way to select, say, a0 interactively for a real-time visualisation of (a1,a2,a3,v) as an animated 3D plot. Or, select (a0,a1) for a static surface plot of (a2,a3,v) etc.Bilodeau
Ah sorry, when you said your data was 4D in the bounty summary, I misunderstood. Not sure if others would agree, but I would say this is a 5D problem, because v is a dimension you wish to displayWindpipe
Criticism readily accepted. It's definitely multi-dimensional - though the exact number of dimensions varies depending upon your perspective. A surface plot is a 3D object representing two parameter dimensions (x and y) to a third, the value, (z)... I've four parameter dimensions mapping to a fifth, the value.Bilodeau
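The selection described in these comments amounts to slicing a 4D array. A minimal NumPy sketch, assuming the values have already been computed on a regular grid; the grid sizes and the random data are placeholders.

```python
import numpy as np

# Hypothetical grid sizes for the four parameter axes (a0, a1, a2, a3).
shape = (4, 5, 6, 7)
rng = np.random.default_rng(0)
values = rng.random(shape)  # values[i0, i1, i2, i3] holds v; random for illustration


def surface(values, i0, i1):
    """Fix a0 and a1; what remains is v over (a2, a3), i.e. a static surface."""
    return values[i0, i1, :, :]


def animation_frames(values, i0):
    """Fix a0 only; stepping through a1 gives one (a2, a3) surface per frame."""
    return [values[i0, i1] for i1 in range(values.shape[1])]


def select(values, fixed):
    """Fix any subset of axes, e.g. {0: 2, 3: 1}; the other axes stay free."""
    index = tuple(fixed.get(axis, slice(None)) for axis in range(values.ndim))
    return values[index]
```

With `select`, switching which parameters are plotted and which are tweaked by hand is only a change to the `fixed` dictionary, so a GUI control could feed it directly.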
W
3

You could try guiqwt. It's aimed at 2D graphs, but targets interactive plots more specifically (as opposed to Matplotlib, although that can handle some degree of interaction too). From the guiqwt documentation:

Overview

Based on PyQwt (plotting widgets for PyQt4 graphical user interfaces) and on the scientific modules NumPy and SciPy, guiqwt is a Python library providing efficient 2D data-plotting features (curve/image visualization and related tools) for interactive computing and signal/image processing application development.

Performances

The most popular Python module for data plotting is currently matplotlib, an open-source library providing a lot of plot types and an API (the pylab interface) which is very close to MATLAB’s plotting interface.

guiqwt plotting features are quite limited in terms of plot types compared to matplotlib. However the currently implemented plot types are much more efficient. For example, the guiqwt image showing function (guiqwt.pyplot.imshow()) do not make any copy of the displayed data, hence allowing to show images which are much larger than with its matplotlib‘s counterpart. In other terms, when showing a 30-MB image (16-bits unsigned integers for example) with guiqwt, no additional memory is wasted to display the image (except for the offscreen image of course which depends on the window size) whereas matplotlib takes more than 600-MB of additional memory (the original array is duplicated four times using 64-bits float data types).

(I haven't tried it, so I can't comment on these claims.)

Whensoever answered 23/1, 2012 at 17:13 Comment(3)
I've already found and installed PyQwt - but I've not come across any obvious strategy to use this library for the sort of plots I've identified above. I'd really like to see an example of something similar to my problem in order to gain confidence in this approach.Bilodeau
I'm not sure if we're talking about the same library, but as far as I'm aware, guiqwt isn't the same as PyQwt.Whensoever
I was imprecise... I've installed both. I've looked (briefly) at this: code.google.com/p/guiqwt - though I can't see how it helps with the problem.Bilodeau
W
3

Okay, now that I understand your data I can definitely suggest a method of visualisation. A coloured 3D surface density plot. Use a0, a1 and a2 as standard x,y,z axes, use a3 as the time axis, and plot different colours over a monochromatic range (or cold to hot). That way the only thing that needs an interactive slider is a3.

As far as tools to do this are concerned

  1. I don't know whether gnuplot can do colour density plots; if it can, this is your best bet. Generate a set of GIFs across the domain of a3, use ImageMagick to make a single animated GIF out of them, then use an animated GIF editor that allows you to move back and forth between frames
  2. Again, with matplotlib, I'm not certain whether it is possible to do colour density plots
  3. SVG can definitely do everything you need to do, including the animation aspects, but as I've said before, is going to be a lot of hard work.
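For what it's worth, Matplotlib can at least approximate the idea with a colour-mapped 3D scatter plot and a `Slider` widget for a3. A minimal sketch, assuming the values sit on a small regular grid; the grid sizes and the random data are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

n, n_a3 = 6, 10  # hypothetical grid sizes
rng = np.random.default_rng(0)
values = rng.random((n, n, n, n_a3))  # placeholder for v[a0, a1, a2, a3]
a0, a1, a2 = np.meshgrid(np.arange(n), np.arange(n), np.arange(n), indexing="ij")

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
fig.subplots_adjust(bottom=0.2)
slider_ax = fig.add_axes([0.2, 0.05, 0.6, 0.04])
slider = Slider(slider_ax, "a3", 0, n_a3 - 1, valinit=0, valstep=1)


def redraw(a3_value):
    """Redraw the scatter for one a3 index, colouring each point by v."""
    i3 = int(a3_value)
    ax.clear()
    ax.scatter(a0.ravel(), a1.ravel(), a2.ravel(),
               c=values[..., i3].ravel(), cmap="coolwarm", vmin=0.0, vmax=1.0)
    ax.set_xlabel("a0")
    ax.set_ylabel("a1")
    ax.set_zlabel("a2")
    ax.set_title(f"a3 index = {i3}")
    fig.canvas.draw_idle()


slider.on_changed(redraw)
redraw(slider.val)
```

Calling `plt.show()` opens the window; dragging the slider steps through a3. Clearing and redrawing on every change keeps the sketch short, at the cost of speed on larger grids.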
Windpipe answered 29/1, 2012 at 9:20 Comment(2)
I think there'd be too much going on in such a plot to make sense of the data. :) My main objection to SVG is that I will need something interactive - allowing me to "fiddle" with parameters real-time... and SVG seems very low level if this is to be my goal.Bilodeau
You're trying to visualise 5-dimensional data. It doesn't matter how you do it, it's going to be difficult to make sense of. You can display at most 4 dimensions meaningfully on a screen without colour, and even then the 4th dimension (change over time) is often difficult to grasp.Windpipe
A
2

Sounds like Mayavi might fit your needs. It is written in Python, can be used interactively and supports 3D graphs and animations. You can have a look at this tutorial to see if it fits your needs.

I have done an interactive 3D visualization with animation in Python using the older version 1 of mayavi, see this page.


Edit

Unfortunately, most Mayavi examples show off too much advanced functionality. Here are two examples that demonstrate more basic applications. If these two do not fit your needs, then Mayavi may not be a good choice in your case. My understanding is that you have arrays of floats that you want to visualize.

Example 1

Here is a specific example from the older page on what you can do with a 3D array of floats: 3D data example. This example shows the use of isocontour surfaces, one solid cut plane through the data and another cut plane with isocontour lines. You can interactively move the cut planes around or choose different visualization tools. (In my case I had added another dimension and an animation that presented the data as 3D-cube slices through the hypercube.)

Example 2

Here is another example of what a more "conventional" plot with Mayavi could look like: Fourier transform example. This is quite similar to what the many other plotting libraries do.

Advocaat answered 25/1, 2012 at 10:24 Comment(3)
My first impression is that Mayavi is focused mainly on visually impressive renderings, rather than drawing basic line/surface plots in real-time. I'd welcome an example of its use showing the contrary...Bilodeau
Thanks for the examples... though they don't seem to address my key concern- i.e. that I have 3 parameters that I pass to a function to generate a sequence of floats that we might assume we can represent as a line graph. Sure, I can switch to 3D surfaces and have a function taking two parameters - perhaps I can animate to lose another... but there are always going to be too many parameters to encode all on a single plot. So, the key thing I think I will need is the ability to interactively change parameters - and, ideally, switch which parameters I plot directly and which I tweak manually.Bilodeau
@aSteve: Thanks for that clarification. With Mayavi you still need to provide the data you want to visualize. In your case that means you still need to provide the (G)UI that lets you select which parameter set to plot, what should vary and what is kept fixed, how to position the surfaces and what to animate. You need to write those separately in Python, as Mayavi cannot do this part for you. Once you do this, however, you can integrate the full Mayavi engine into your user interface, see e.g. github.enthought.com/mayavi/mayavi/advanced_scripting.htmlAdvocaat
S
0

Go download a free trial of Tableau (www.tableausoftware.com). It will encode your data on X, Y, size, color and shape, and you can create small multiples for any other dimensions you have -- i.e. you can look at lots of dimensions at once. You can try lots and lots of visualizations very rapidly. There is free training on the company website.

Disclaimer: I work for them.
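The small-multiples idea is not specific to Tableau. A rough Matplotlib sketch, with a hypothetical array of precomputed sequences indexed by two of the parameters (the axis names `window` and `offset` are assumptions for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_window, n_offset, n_points = 3, 4, 25  # hypothetical sizes
values = rng.random((n_window, n_offset, n_points))  # placeholder sequences

# One small line plot per (window, offset) pair, sharing axes for comparison.
fig, axes = plt.subplots(n_window, n_offset, sharex=True, sharey=True,
                         figsize=(10, 6))
for i in range(n_window):
    for j in range(n_offset):
        axes[i, j].plot(values[i, j])
        axes[i, j].set_title(f"window={i}, offset={j}", fontsize=8)
fig.tight_layout()
```

Shared axes make differences between panels easier to compare by eye; `plt.show()` displays the grid.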

Swop answered 30/1, 2012 at 8:43 Comment(1)
Can I use this software to plot data I calculate on-the-fly, or can I only use it if I calculate all the data up-front (in a specific format)?Bilodeau
T
0

The simplest visualization for 3+ dimensions is a bubble chart or motion chart. On top of the x and y axes you can use the bubble size and the bubble color for the extra dimensions.

Google visualization (http://code.google.com/apis/chart/interactive/docs/gallery/motionchart.html) and its Google spreadsheet interactive mode give a simple interface for playing with which of the dimensions is mapped to which of the axes, size or color.

It is not aimed at handling too many data points, but you can use it to identify patterns on samples of the data with ease.
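The same encoding can be sketched in Matplotlib rather than the Google chart API: `scatter` takes a size array `s` and a color array `c`. The data below is a random placeholder sample.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Placeholder sample: four columns standing in for four dimensions.
x, y, size_dim, color_dim = rng.random((4, 50))

fig, ax = plt.subplots()
bubbles = ax.scatter(x, y, s=20 + 300 * size_dim, c=color_dim,
                     cmap="viridis", alpha=0.6)
fig.colorbar(bubbles, ax=ax, label="fourth dimension (color)")
ax.set_xlabel("first dimension (x)")
ax.set_ylabel("second dimension (y)")
```

Swapping which column feeds `x`, `y`, `s` or `c` is the static equivalent of reassigning dimensions in the motion chart; `plt.show()` displays the result.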

Trask answered 30/1, 2012 at 19:27 Comment(0)
