Efficiently load a large Mat into memory in OpenCV

Is there a more efficient way to load a large Mat object into memory than the FileStorage method in OpenCV?

I have a large Mat with 192 columns and 1 million rows that I want to store locally in a file and load into memory when my application starts. FileStorage works, but I was wondering whether a more efficient method exists. At the moment it takes about 5 minutes to load the Mat in Debug mode in Visual Studio and around 3 minutes in Release mode. The data file is around 1.2 GB.

Is the FileStorage method the only method available to do this task?

Recrimination answered 1/9, 2015 at 13:27 Comment(0)

Are you ok with a 100x speedup?


You should save and load your matrices in binary format. You can do that with the matwrite and matread functions in the code below.

I tested both loading from a FileStorage and the binary file, and for a smaller image with 250K rows, 192 columns, type CV_8UC1 I got these results (time in ms):

// Mat: 250K rows, 192 cols, type CV_8UC1
Using FileStorage: 5523.45
Using Raw:         50.0879    

On an image with 1M rows and 192 cols, using the binary mode, I got (time in ms):

// Mat: 1M rows, 192 cols, type CV_8UC1
Using FileStorage: (can't load, out of memory)
Using Raw:         197.381
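
To put these timings in perspective, the size of the raw file can be estimated from the matrix dimensions alone. The sketch below uses a hypothetical helper, raw_file_size (not part of OpenCV), and assumes the layout written by matwrite in this answer: a header of four 4-byte ints followed by the element data.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an OpenCV function): expected size in bytes of a
// raw file holding a 4-int header followed by rows * cols elements.
// Assumes int is 4 bytes, which is typical but not guaranteed by the standard.
std::uint64_t raw_file_size(std::uint64_t rows, std::uint64_t cols,
                            std::uint64_t elem_size)
{
    const std::uint64_t header = 4 * sizeof(std::int32_t); // rows, cols, type, channels
    return header + rows * cols * elem_size;               // 64-bit math, no int overflow
}
```

For 1,000,000 rows and 192 columns this gives 192,000,016 bytes for CV_8UC1 (1 byte per element) and 768,000,016 bytes for CV_32FC1 (4 bytes per element).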

NOTE

  1. Never measure performance in debug.
  2. 3 minutes to load a matrix seems way too much, even for FileStorages. However, you'll gain a lot switching to binary mode.
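
When comparing loaders in a Release build, a small timing wrapper keeps the measurement identical for every method. This is a minimal sketch using std::chrono instead of the OpenCV tick functions; the name time_ms is hypothetical.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: calls f once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename F>
double time_ms(F&& f)
{
    const auto tic = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    const auto toc = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(toc - tic).count();
}
```

It could be called as time_ms([&] { m = matread("raw.bin"); }), with m declared beforehand. A single run is noisy (disk caching in particular), so repeating the call and comparing the runs is usually more informative.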

Here is the code with the functions matwrite and matread, and the test:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <fstream>

using namespace std;
using namespace cv;


void matwrite(const string& filename, const Mat& mat)
{
    ofstream fs(filename, fstream::binary);

    // Header
    int type = mat.type();
    int channels = mat.channels();
    fs.write((char*)&mat.rows, sizeof(int));    // rows
    fs.write((char*)&mat.cols, sizeof(int));    // cols
    fs.write((char*)&type, sizeof(int));        // type
    fs.write((char*)&channels, sizeof(int));    // channels

    // Data
    if (mat.isContinuous())
    {
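        // total() * elemSize() is the size of this Mat's own data; dataend - datastart
        // spans the whole parent buffer when mat is a continuous ROI.
        fs.write(mat.ptr<char>(0), mat.total() * mat.elemSize());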
    }
    else
    {
        int rowsz = CV_ELEM_SIZE(type) * mat.cols;
        for (int r = 0; r < mat.rows; ++r)
        {
            fs.write(mat.ptr<char>(r), rowsz);
        }
    }
}

Mat matread(const string& filename)
{
    ifstream fs(filename, fstream::binary);

    // Header
    int rows, cols, type, channels;
    fs.read((char*)&rows, sizeof(int));         // rows
    fs.read((char*)&cols, sizeof(int));         // cols
    fs.read((char*)&type, sizeof(int));         // type
    fs.read((char*)&channels, sizeof(int));     // channels

    // Data
    Mat mat(rows, cols, type);
    // size_t arithmetic avoids int overflow for matrices larger than 2 GB
    fs.read((char*)mat.data, mat.total() * mat.elemSize());

    return mat;
}

int main()
{
    // Save the randomly generated data
    {
        Mat m(1024*256, 192, CV_8UC1);
        randu(m, 0, 256);   // full range of CV_8U; larger values would saturate to 255

        FileStorage fs("fs.yml", FileStorage::WRITE);
        fs << "m" << m;

        matwrite("raw.bin", m);
    }

    // Load the saved matrix

    {
        // Method 1: using FileStorage
        double tic = double(getTickCount());

        FileStorage fs("fs.yml", FileStorage::READ);
        Mat m1;
        fs["m"] >> m1;

        double toc = (double(getTickCount()) - tic) * 1000. / getTickFrequency();
        cout << "Using FileStorage: " << toc << endl; 
    }

    {
        // Method 2: using raw binary data
        double tic = double(getTickCount());

        Mat m2 = matread("raw.bin");

        double toc = (double(getTickCount()) - tic) * 1000. / getTickFrequency();
        cout << "Using Raw: " << toc << endl;
    }

    int dummy;
    cin >> dummy;

    return 0;
}
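
The code above does not check whether the file opened or whether the reads succeeded. The following is a minimal sketch of the same file layout with those checks added. It is written without OpenCV so it can be compiled on its own: RawMat, write_raw and read_raw are hypothetical names, and RawMat is only a stand-in for cv::Mat.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for cv::Mat so the sketch compiles without OpenCV.
struct RawMat
{
    std::int32_t rows = 0;
    std::int32_t cols = 0;
    std::int32_t type = 0;      // OpenCV type code, stored verbatim
    std::int32_t channels = 1;
    std::vector<char> data;     // rows * cols * element size bytes
};

void write_raw(const std::string& filename, const RawMat& m)
{
    std::ofstream fs(filename, std::ios::binary);
    if (!fs) throw std::runtime_error("cannot open " + filename + " for writing");

    // Header: four 32-bit ints, in the same order as matwrite
    const std::int32_t header[4] = { m.rows, m.cols, m.type, m.channels };
    fs.write(reinterpret_cast<const char*>(header), sizeof(header));
    fs.write(m.data.data(), static_cast<std::streamsize>(m.data.size()));
    if (!fs) throw std::runtime_error("write failed: " + filename);
}

// elem_size is the number of bytes per element (all channels included).
RawMat read_raw(const std::string& filename, std::size_t elem_size)
{
    std::ifstream fs(filename, std::ios::binary);
    if (!fs) throw std::runtime_error("cannot open " + filename + " for reading");

    std::int32_t header[4];
    fs.read(reinterpret_cast<char*>(header), sizeof(header));
    if (!fs) throw std::runtime_error("truncated header: " + filename);
    if (header[0] < 0 || header[1] < 0)
        throw std::runtime_error("negative dimensions in " + filename);

    RawMat m;
    m.rows = header[0];
    m.cols = header[1];
    m.type = header[2];
    m.channels = header[3];

    // Compute the payload size in size_t to avoid int overflow on large matrices
    const std::size_t bytes = static_cast<std::size_t>(m.rows) *
                              static_cast<std::size_t>(m.cols) * elem_size;
    m.data.resize(bytes);
    fs.read(m.data.data(), static_cast<std::streamsize>(bytes));
    if (!fs) throw std::runtime_error("truncated payload: " + filename);
    return m;
}
```

The same checks carry over to matwrite and matread: test the stream after opening it and after each read or write. Note that the file stores ints in the machine's native byte order, so it is only portable between machines with the same endianness and int size.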
Buyse answered 2/9, 2015 at 15:59 Comment(3)
What is the use of storing the channels value? You don't seem to use it when reading the file back. I'm only asking because I am having some trouble with a similar function of my own. - Alixaliza
@sturkmen 1) I have production code with this version, and I don't want to break it. 2) The channel information can be useful for loading the image in, say, MATLAB. 3) Saving one byte is not a big deal. 4) It's pretty easy to update the code without channels ;) - Buyse
Is it possible to use this method for recording? I mean, instead of using cv::VideoWriter, it might be very efficient if we could adopt this approach. - Infelicitous
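
On the channels question in the comments: within OpenCV the channel count is already encoded in the type value, which is why matread does not need the separate field. A reader outside OpenCV has to decode it by hand or rely on the stored channels value. The sketch below shows the decoding, assuming the type layout used by OpenCV 2.x through 4.x (depth in the low 3 bits, channels minus one in the bits above). The function names are hypothetical and only mirror what the CV_MAT_DEPTH and CV_MAT_CN macros compute.

```cpp
#include <cstddef>

// Depth code: low 3 bits of the type (0 = CV_8U, 1 = CV_8S, 2 = CV_16U,
// 3 = CV_16S, 4 = CV_32S, 5 = CV_32F, 6 = CV_64F).
inline int depth_of(int type) { return type & 7; }

// Channel count: stored as (channels - 1) in the bits above the depth.
inline int channels_of(int type) { return ((type >> 3) & 511) + 1; }

// Bytes per single-channel value for depths CV_8U..CV_64F.
// Returns 0 for depth 7, whose meaning differs between OpenCV versions.
inline std::size_t depth_bytes(int depth)
{
    static const std::size_t sizes[7] = { 1, 1, 2, 2, 4, 4, 8 };
    return (depth >= 0 && depth < 7) ? sizes[depth] : 0;
}

// Bytes per element, all channels included (what CV_ELEM_SIZE reports).
inline std::size_t elem_size_of(int type)
{
    return depth_bytes(depth_of(type)) * static_cast<std::size_t>(channels_of(type));
}
```

For example, CV_8UC3 has the type value 16, which decodes to depth 0 (CV_8U) and 3 channels, so each element occupies 3 bytes.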
