Actually I'm using R + Python with RPY2 to manipulate data and ggplot to create beautiful graphics.. I have some data in a PostgreSQL database, and I'm using psycopg2 to query data.
I'm starting a thesis, and in the future I need an OLAP cube to store my (very big) simulation data: multiple dimension, aggregation query, etc.
Is there any best or standard practice for interfacing between Python (and I want Python + R, no jpivot or some other dashboard in Java) and an OLAP engine like Mondrian? I searched on Google for any solution, and didn't I find anything.
I've briefly evaluated SQLAlchemy, and Django-ORM, but they have no MDX or XML/A interface to query an OLAP server (Mondrian or other) ...
Is it possible to write a query in MDX and, with psycopg + ODBC, query my OLAP server, and the OLAP server giving me an answer from my simulation data (no mapping on Python object, but it's OK for me)?
Update 1 :
Why do I need to search around OLAP + Mondrian technology ?
Because University of Laval (GeoSoa departements + Thierry Badard) wrote a spatial extension to OLAP: SOLAP, and implemented this in Mondrian as GeoMondrian. That interest me because I'm working on spatial multi agent based simulation ( ~= geosimulation).
The GeoSoa departement created an Ajax based component to communicate and visualize spatial data with GeoMondrian: SOLAPLAYERS, which can query a Mondrian server by its Xlma servlet.
Problem : probably slow in big data manipulation, need Internet or Apache 2. Briefly, it's only to visualize data or map ... In my case, I need raw data to make my own data manipulation + graphics with R: spatial analysis, regression analysis, rank-tail, etc. Here, SOLAP help me to prepare data for this later complex R analysis.
Why Python?
1 - Web access to spatial data -
I'm trying to use a "cool" Python framework, like GeoDjango or MapFish: big community in GIS, open-source, use GeoAlchemy to manipulate spatial query/data, include visualisation with JavaScript extensions and OpenLayers, etc.
2 - Local access to spatial data in GIS -
I want to create a plugin in QGIS (open source GIS) to access and visualize data, and QGIS plugin and API = Python.
3 - Automatic analysis of data -
A user or scientist runs a simulation with grid computing and choose automatic analysis (R + ggplot2 + MDX query) they want to run on this data. My goal here is to create a synthetic report of the simulation (graphic, tabular data, etc.).
So, after simulation, data go to OLAP/SOLAP cube, and many Python scripts (created by the user) get data with MDX, manipulate data with R + RPY2, and write and produce cool output for the scientist on doku-wiki or another community-platform.
Problem?
1 - Olap4j, the API core of Mondrian to communicate with an external component, is Java-made :/
2 - SOLAPLAYERS uses Ajax to access data, too slow for me.
3 - SQLAlchemy and GeoAlchemy have no driver connection to a multidimensional database (OLAP).
* Solution? *
1 - Py4j to access Java object or Java collection in olap4j with Python? Write my own function to access the Java mapped collection? => dangerous and not very easy?...
2 - XLMA with Ajax Mondrian server? It is too slow.
3 - Write my own py-connector to OLAP Mondrian ? => Ouch. It's an hard way, I think.
What should I do?