h2 default DECIMAL precision performance
Asked Answered
D

2

5

When I create DECIMAL column with default precision setting, in H2 web console this column is defined as DECIMAL(65535, 32767), command "show columns from ..." gives me DECIMAL(65535).

Decimal data type in H2 is mapped to BigDecimal (from H2 documentation), but I'm not sure how H2 deals with it.

Are there any performance gains, if I use smaller precision?

Deciliter answered 29/4, 2013 at 9:7 Comment(1)
DO you really need 65535 integer digits and 32767 fraction digits? Is that what you intended by this declaration?Inoperable
J
7

To get to know more about the DECIMAL datatype of H2, inspect the org.h2.value.ValueDecimal class that comes with the h2-x.y.z.jar.

Taking a closer look you'll see the default values are as you mentioned:

/** The default precision for a decimal value. */
static final int DEFAULT_PRECISION = 65535;
/** The default scale for a decimal value.     */
static final int DEFAULT_SCALE = 32767;

Looking closer in ValueDecimal:

private final BigDecimal value;

and org.h2.store.Data:

public Value readValue() {
    ...
    case Value.DECIMAL: {
        int scale = readVarInt();
        int len = readVarInt();
        byte[] buff = DataUtils.newBytes(len);
        read(buff, 0, len);
        BigInteger b = new BigInteger(buff);
        return ValueDecimal.get(new BigDecimal(b, scale));
    }

You can see that a DECIMAL is nothing more than a BigDecimal. That is, all the performance issues you'll face with java.math.BigDecimal, you'll face with DECIMAL.

If you are really into it, you can study the class further and see what exact role the precision/scale play.

If we go to the docs, all H2 say about DECIMAL data types and performance is:

The DECIMAL/NUMERIC type is slower and requires more storage than the REAL and DOUBLE types.

So they say that is a fact.

But since you are talking about performance, we can cut to the chase and do some tests. The test class' code is below, let's get to the output/results:

TYPE              INSERT time    COUNT() time   .db Size (kb)  
DECIMAL(20,2)     6.978          0.488          27958.0        
DECIMAL(100,2)    4.879          0.407          25648.0        
DECIMAL(100,80)   8.794          0.868          90818.0        
DECIMAL(60000,2)  4.388          0.4            25104.0        
DECIMAL(1000,900) 112.905        6.549          1016534.0      
REAL              5.938          0.318          22608.0        
DOUBLE            6.985          0.416          25088.0    

As you can see, there is no noticeable change in the time or storage size when the precision changes (precision 20 takes roughly as much time/size as 60000!).

The thing is when you change the scale. This is what you should worry about; as you can see, DECIMAL(100,2) and DECIMAL(100,80) show a great increase in both time and storage.

DECIMAL(1000,900) takes more than 1 gigabyte(!!!) storing absolutely the same values.

Finally, in the test above, REAL and DOUBLE don't seem to be so much better than DECIMAL (they may even seem worse). But try changing the number of lines inserted (the for loop in the test method), the bigger the number, the better they seem to respond.

* DECIMAL(20,2) seems to be slower/larger than the rest. That is not real. Actually, whatever you choose to run first is going to be slightly slower/larger. Go figure...

public class Main {
    public static void main(String[] a) throws Exception {
        Class.forName("org.h2.Driver");
        System.out.format("%-18s%-15s%-15s%-15s", "TYPE", "INSERT time", "COUNT() time", ".db Size (kb)");
        System.out.println();
        testPerformance("TEST_DECIMAL_20_2",     "DECIMAL(20,2)");
        testPerformance("TEST_DECIMAL_100_2",    "DECIMAL(100,2)");
        testPerformance("TEST_DECIMAL_100_80",   "DECIMAL(100,80)");
        testPerformance("TEST_DECIMAL_60000_2",  "DECIMAL(60000,2)");
        testPerformance("TEST_DECIMAL_1000_900", "DECIMAL(1000,900)");
        testPerformance("TEST_REAL",             "REAL");
        testPerformance("TEST_DOUBLE",           "DOUBLE"); 
    }

    private static void testPerformance(String dbName, String type) throws SQLException {
        System.out.format("%-18s", type);
        Connection conn = DriverManager.getConnection("jdbc:h2:" + dbName, "sa", "");
        conn.createStatement().execute("DROP TABLE IF EXISTS TEST;");
        conn.createStatement().execute("CREATE TABLE TEST (DECTEST " + type +" )");
        long insertStartTime = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) {
            conn.createStatement().execute("INSERT INTO TEST (DECTEST) VALUES (12345678901234.45)");    
        }
        double insertTime = ((double)(System.currentTimeMillis()-insertStartTime))/1000;
        System.out.format("%-15s", insertTime+"");
        long countStartTime = System.currentTimeMillis();
        conn.createStatement().executeQuery("select COUNT(DECTEST) from TEST");
        double countTime = ((double)(System.currentTimeMillis()-countStartTime))/1000;
        System.out.format("%-15s", countTime+"");
        conn.close();
        double fileSize = (double)new File(dbName+".h2.db").length() / 1024;
        System.out.format("%-15s", fileSize+"");
        System.out.println();
    }
}
Jillion answered 30/4, 2013 at 4:16 Comment(0)
C
3

The precision and scale of DECIMAL is optional in H2. If you create the column without specifying precision / scale, then performance is not affected, storage is not affected, and you can insert any value and will get the same value back:

create table test(data decimal);
insert into test values(1.345);
select * from test;
> 1.345
Chaunce answered 30/4, 2013 at 5:1 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.