I am working on a project where I want to perform data acquisition, data processing, and GUI visualization (using PyQt with pyqtgraph), all in Python. Each part is implemented in principle, but the parts are not well separated, which makes it difficult to benchmark and improve performance. So the question is:
Is there a good way to handle large amounts of data between the different parts of an application?
I think of something like the following scenario (see the sketch after this list):
- Acquisition: get data from some device(s) and store it in a data container that can be accessed from elsewhere. (This part must be able to run without the processing and visualization parts. It is time-critical, as I don't want to lose data points!)
- Processing: take data from the data container, process it, and store the results in another data container. (This part should also be able to run without the GUI, and with a delay after acquisition, e.g. processing data that I recorded last week.)
- GUI/visualization: take acquired and processed data from the containers and visualize it.
- Saving: I want to be able to store/stream certain parts of the data to disk.
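A minimal sketch of that separation, assuming a hypothetical `read_block()` stands in for the real device driver, could give each stage its own process connected by `multiprocessing` queues, so the acquisition loop never blocks on processing or plotting:

```python
import multiprocessing as mp

import numpy as np


def read_block(n=2_000_000):
    """Stand-in for the real device driver: one second of 16-bit samples."""
    return np.random.randint(-2**15, 2**15, size=n, dtype=np.int16)


def acquire(raw_queue):
    """Time-critical producer: read blocks and hand them off immediately."""
    while True:
        raw_queue.put(read_block())  # never process or plot in this loop


def process(raw_queue, result_queue):
    """Consumer: pull raw blocks, compute results, pass them onward."""
    while True:
        block = raw_queue.get()
        result_queue.put(np.abs(np.fft.rfft(block)))  # placeholder step


if __name__ == "__main__":
    raw_q, result_q = mp.Queue(), mp.Queue()
    mp.Process(target=acquire, args=(raw_q,), daemon=True).start()
    mp.Process(target=process, args=(raw_q, result_q), daemon=True).start()
    print(result_q.get().shape)  # a GUI would poll result_q from a QTimer
```

The pyqtgraph window would then only ever touch `result_q` (e.g. polled from a `QTimer`), so it can be attached or detached without affecting acquisition.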
When I say "large amounts of data", I mean that I get arrays with approximately 2 million data points (16-bit) per second, i.e. roughly 4 MB/s of raw data, that need to be processed and possibly also stored.
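For the saving part, at ~4 MB/s a chunked, appendable HDF5 dataset is one option; a sketch assuming `h5py`, with the file name and chunk size chosen arbitrarily:

```python
import h5py
import numpy as np

with h5py.File("acquisition.h5", "w") as f:
    dset = f.create_dataset(
        "raw",
        shape=(0,), maxshape=(None,),        # resizable along the time axis
        dtype="int16", chunks=(2_000_000,),  # one chunk per second of data
    )
    for _ in range(10):  # stand-in for ten seconds of acquisition
        block = np.random.randint(-2**15, 2**15, 2_000_000, dtype=np.int16)
        old = dset.shape[0]
        dset.resize((old + block.size,))
        dset[old:] = block  # append without rewriting earlier data
```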
Is there any framework for Python that I can use to handle this amount of data properly? Maybe in the form of a data server that I can connect to.
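One commonly used building block for the data-server idea is ZeroMQ (pyzmq): the acquisition process publishes each block, and any number of consumers (processing, GUI, disk writer) subscribe independently. A rough sketch, with an arbitrary port number; the two functions would run in separate processes:

```python
import time

import numpy as np
import zmq


def publisher():
    """Run in the acquisition process: broadcast each block as raw bytes."""
    sock = zmq.Context().socket(zmq.PUB)
    sock.bind("tcp://*:5556")  # arbitrary port
    while True:
        block = np.random.randint(-2**15, 2**15, 2_000_000, dtype=np.int16)
        sock.send(block)  # numpy arrays expose the buffer protocol
        time.sleep(1.0)   # stand-in for the 1 s acquisition cadence


def subscriber():
    """Run in each consumer (processing, GUI, disk writer) independently."""
    sock = zmq.Context().socket(zmq.SUB)
    sock.connect("tcp://localhost:5556")
    sock.setsockopt(zmq.SUBSCRIBE, b"")  # no topic filtering
    while True:
        data = np.frombuffer(sock.recv(), dtype=np.int16)
        print(data.shape, data.mean())
```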
Have you looked at pandas (pandas.pydata.org)? MVC is about separating data from its representation in GUIs, and pandas is about processing large amounts of data. I hope it will be a start for you! – Considerate