SQLiteBlobTooBigException still occurs after dividing the request in chunks of 1 MB (and cursor.close())
Asked Answered
H

1

1

It is possible to import a text file of 6MB directly into my database. However, the text can't be extracted since CursorWindow has a limit of 2MB. (I should have used files, but some users already have this problem and I would need to read the entire text to be able to put it in a file) I used substr (a special SQL function) to only request 1 MB and it worked. However, the following while loop doesn't work after the second iteration (which means that even if I called cursor.close(), the CursorWindow wasn't emptied, so for the first iteration it only has 1MB, but after the second it has 2MB and the exception SQLiteBlobTooBigException is thrown):

        //Load in chunks
        BookDbHelper bookDbHelper = new BookDbHelper(GlobalContext.get());
        SQLiteDatabase readableDatabase = bookDbHelper.getReadableDatabase();
        //Query length
        int chunk_size = (int) Math.pow(2, 20);//mb
        String query_length = "SELECT _id, length(text) FROM " + BookContract.TABLE_NAME + " WHERE _id=" + id;
        Cursor cursor_length = readableDatabase.rawQuery(query_length, null);
        cursor_length.moveToFirst();
        int length = cursor_length.getInt(1);
        cursor_length.close();
        bookDbHelper.close();
        readableDatabase.close();
        //Query text
        int numSteps = length / chunk_size + 1;
        int i = 0;
        while(i < numSteps) {
            BookDbHelper bookDbHelper2 = new BookDbHelper(GlobalContext.get());
            SQLiteDatabase readableDatabase2 = bookDbHelper2.getReadableDatabase();
            int from = i * chunk_size + 1;
            int to = (i + 1) * chunk_size + 1;
            //L.v(from + ", " + to);
            String query = "SELECT _id, substr(text," + from + "," + to + ") FROM " + BookContract.TABLE_NAME + " WHERE _id=" + id;
            Cursor cursor = readableDatabase2.rawQuery(query, null);
            //Read
            cursor.moveToFirst();
            String string = cursor.getString(1);
            cursor.close();
            bookDbHelper2.close();
            readableDatabase2.close();
            //stringBuilder.append(string);
            i++;
        }

The relevant columns are _id and text (which contains a very large string), and the relevant sql functions are length() (to know the number of iterations necessary) and substr() (so that the SQLiteBlobTooBigException doesn't occur right away since the 2MB limit wasn't reached).

I tried closing bookDbHelper and readableDatabase and it didn't help.

How can I force CursorWindow to close so that I make a request of 1MB, empty the CursorWinow, and continue to make other requests?

Haff answered 15/12, 2019 at 22:55 Comment(0)
U
2

How can I force CursorWindow to close so that I make a request of 1MB, empty the CursorWinow, and continue to make other requests?

I don't believe that closing the Cursor is your issue, as if the by not closing the Cursor appends to the Cursor and expands.

Rather your issue is with the query that you build.

In short the substr function is not from to, it is from for (for being the size/length of the returned string. Your calculation is based upon the 2nd value being the offset of the character). As such the extracted string's length is increased by the chunk size until it exceeds the end of the string (blew the CursorWindow before this) when it reduces.

So the second chunk using 1MB (if looked at as using offsets) was doomed to failure on the 2nd run as it's actually the length (2MB) to extract. Decreasing to less than 1MB would allow some leeway but potentially blow the CursorWindow (but get additional data).

However, as an alternative that uses a single cursor with each chunk as an extarcted row. The soution could be :-

    //Load in chunks
    BookDbHelper bookDbHelper = new BookDbHelper(/*GlobalContext.get()*/this);
    SQLiteDatabase readableDatabase = bookDbHelper.getReadableDatabase();
    //Query length
    StringBuilder wholeBookText = new StringBuilder();
    int chunk_size = (int) Math.pow(2, 20);//mb
    String query_length = "SELECT length(text) FROM " + BookContract.TABLE_NAME + " WHERE _id=?";
    Cursor cursor = readableDatabase.rawQuery(query_length, new String[]{String.valueOf(id)});
    int length = 0;
    if (cursor.moveToFirst()) {
        length = cursor.getInt(0);
    }
    int numSteps = length / chunk_size + 1;
    int i = 0;
    Log.d("BOOKINFO", "Length of Text is " + length + " Number of Chunks = " + numSteps + " Chunk Size = " + chunk_size);

    StringBuilder sb = new StringBuilder();
    for (i=1; i < length + 1; i+= chunk_size) {
        if (sb.length() > 1) sb.append(" UNION ALL ");
        sb.append("SELECT substr(text,")
                .append(String.valueOf(i)).append(",").append(String.valueOf(chunk_size))
                .append(") FROM ").append(BookContract.TABLE_NAME)
                .append(" WHERE _id=").append(String.valueOf(id));

    }
    sb.append(";");
    Log.d("BOOKINFOV2","SQL generated :-\n\t" + sb.toString());
    cursor = readableDatabase.rawQuery(sb.toString(),null);
    wholeBookText = new StringBuilder();
    while (cursor.moveToNext()) {
        wholeBookText.append(cursor.getString(0));
        Log.d("BOOKINFO","Obtained String who's length is " + cursor.getString(0).length() + "\n\tTotal Extracted = " + wholeBookText.length());
    }

Rather than indivudal queries run in a loop. This generates a query that extracts each chunk as a row. That is it makes a UNION between all the queries. e.g.

SELECT substr(text,1,1048576) FROM book WHERE _id=4 
    UNION ALL SELECT substr(text,1048577,1048576) FROM book WHERE _id=4 
    UNION ALL SELECT substr(text,2097153,1048576) FROM book WHERE _id=4 
    UNION ALL SELECT substr(text,3145729,1048576) FROM book WHERE _id=4;
  • taken from a test run of the above.
  • as can be seen the to (should be for) is the chunk size. The last chunk will be truncated according to the remaining data.

The full output from the test run :-

2019-12-16 14:21:35.546 D/BOOKINFOV2: SQL generated :-
        SELECT substr(text,1,1048576) FROM book WHERE _id=4 UNION ALL SELECT substr(text,1048577,1048576) FROM book WHERE _id=4 UNION ALL SELECT substr(text,2097153,1048576) FROM book WHERE _id=4 UNION ALL SELECT substr(text,3145729,1048576) FROM book WHERE _id=4;
2019-12-16 14:21:35.555 W/CursorWindow: Window is full: requested allocation 1048577 bytes, free space 1048128 bytes, window size 2097152 bytes
2019-12-16 14:21:35.585 D/BOOKINFO: Obtained String who's length is 1048576
        Total Extracted = 1048576
2019-12-16 14:21:35.599 W/CursorWindow: Window is full: requested allocation 1048577 bytes, free space 1048128 bytes, window size 2097152 bytes
2019-12-16 14:21:35.616 D/BOOKINFO: Obtained String who's length is 1048576
        Total Extracted = 2097152
2019-12-16 14:21:35.653 D/BOOKINFO: Obtained String who's length is 1048576
        Total Extracted = 3145728
2019-12-16 14:21:35.654 D/BOOKINFO: Obtained String who's length is 51
        Total Extracted = 3145779
  • As you can see the CursorWindow would overflow, but the row is not added, next time around it is added and accessible.

Of course you could adapt the multiple query approach in which case the code could be :-

    //Load in chunks
    BookDbHelper bookDbHelper = new BookDbHelper(/*GlobalContext.get()*/this);
    SQLiteDatabase readableDatabase = bookDbHelper.getReadableDatabase();
    //Query length
    StringBuilder wholeBookText = new StringBuilder();
    int chunk_size = (int) Math.pow(2, 19);//mb
    chunk_size = (1024 * 1024);
    String query_length = "SELECT length(text) FROM " + BookContract.TABLE_NAME + " WHERE _id=?";
    Cursor cursor = readableDatabase.rawQuery(query_length, new String[]{String.valueOf(id)});
    int length = 0;
    if (cursor.moveToFirst()) {
        length = cursor.getInt(0);
    }
    int numSteps = length / chunk_size + 1;
    int i = 0;
    Log.d("BOOKINFO", "Length of Text is " + length + " Number of Chunks = " + numSteps + " Chunk Size = " + chunk_size);

    int from = 1, to = chunk_size;
    while (i < numSteps && length > 0) {
        if (to > length) to = length;
        String query = "SELECT substr(text," + from + "," + (chunk_size) + ") FROM " + BookContract.TABLE_NAME + " WHERE _id=?";
        Log.d("BOOKINFOSQL",query);
        cursor.close();
        cursor = readableDatabase.rawQuery(query, new String[]{String.valueOf(id)});
        //Read
        if (cursor.moveToFirst()) {
            wholeBookText.append(cursor.getString(0));
            Log.d("BOOKINFO","Obtained String who's length is " + cursor.getString(0).length() + "\n\tTotal Extracted = " + wholeBookText.length());
        }
        cursor.close();
        i++;
        from = (i * chunk_size) + 1;
        to = from + chunk_size;
    }
    if (!cursor.isClosed()) {
        cursor.close();
    }
    Log.d("BOOKINFO", "The length of the extracted data is " + wholeBookText.length());

The above results in :-

2019-12-16 14:16:15.336 D/BOOKINFO: Length of Text is 3145779 Number of Chunks = 4 Chunk Size = 1048576
2019-12-16 14:16:15.336 D/BOOKINFOSQL: SELECT substr(text,1,1048576) FROM book WHERE _id=?
2019-12-16 14:16:15.358 D/BOOKINFO: Obtained String who's length is 1048576
        Total Extracted = 1048576
2019-12-16 14:16:15.358 D/BOOKINFOSQL: SELECT substr(text,1048577,1048576) FROM book WHERE _id=?
2019-12-16 14:16:15.382 D/BOOKINFO: Obtained String who's length is 1048576
        Total Extracted = 2097152
2019-12-16 14:16:15.383 D/BOOKINFOSQL: SELECT substr(text,2097153,1048576) FROM book WHERE _id=?
2019-12-16 14:16:15.409 D/BOOKINFO: Obtained String who's length is 1048576
        Total Extracted = 3145728
2019-12-16 14:16:15.409 D/BOOKINFOSQL: SELECT substr(text,3145729,1048576) FROM book WHERE _id=?
2019-12-16 14:16:15.418 D/BOOKINFO: Obtained String who's length is 51
        Total Extracted = 3145779
2019-12-16 14:16:15.418 D/BOOKINFO: The length of the extracted data is 3145779
Unabridged answered 16/12, 2019 at 0:25 Comment(5)
Not sure I understand where the for in "from for" comes from. The text is about 6.3MB, and if I do from 6MB+1 to 7MB+1 it works, so the query itself is fine. The problem occurs when I do two such queries of 1MB in a row. I am not sure why using "UNION" makes a difference (shouldn't the CursorWindow still exceed 2MB?), but I will try it. Thank you,Recension
("So the second chunk using 1Mb was doomed to failure on the2nd run." Why? Shouldn't cursor.close() free/empty the CursorWindow?)Recension
substr(2,4) is from char 2 for 4 characters (4 is size not offset) so from character 2 to character 6. As you were calculating the offset the 2nd was for 2MB, 3rd would have been for 3Mb and so on. So 2nd substr(1048576,2097153) (i.e.` 2 * 1048576 + 1` as per int to = (i + 1) * chunk_size + 1;) is doomed as 2MB would be exceeded. The issue and solution has nothing to do with using UNION or closing the Cursor. The data in a chunk was progressively increasing. Perhaps run yours with logging as above, you'd get 1,1048577 then 1048577, 2097153 ..Unabridged
(Oh I see, you are correct. Thank you for your help!)Recension
On Android that cursor window is limited to 2 MB. If these are text files, just link them, in order to skip all the processing. An index for several books, authors and publishing dates would rather be suitable for SQLite.Windrow

© 2022 - 2024 — McMap. All rights reserved.