Oracle preferred column lengths

Does the declared length of a column, in particular whether it is a power of two, somehow influence database performance?

In other words, what is the difference in performance between the following two tables:

TBL1:
  - CLMN1 VARCHAR2(63)
  - CLMN2 VARCHAR2(129)
  - CLMN3 VARCHAR2(250)

and

TBL2:
  - CLMN1 VARCHAR2(64)
  - CLMN2 VARCHAR2(128)
  - CLMN3 VARCHAR2(256)

Should we always attempt to make a column's length a power of 2, or does only the maximum size matter?

Some developers claim that the declared column length matters (and should be a power of two) because it influences how Oracle distributes and saves the data on disk and how it shares its cache in memory. Can someone prove or disprove this?

Asked by Ripping on 17/1, 2013 at 8:54. Comments (7):
Have you tried performance testing it? – Neckerchief
No. I wonder what the theoretical explanation of the difference is, if there is one, because developers often like assigning power-of-two lengths to columns (64, 128, ...). I wonder: why? Does it really have any influence? – Ripping
It's possible the developers you know are thinking of hash partitioning, where it's recommended that the number of partitions is a power of 2 – Glandule
@Glandule thank you, that is a valuable article. But I still don't understand: is there any link between those partitions and column lengths? – Ripping
None @Andremoniy, which is why this isn't an answer. I'm guessing that the developers you know might be confusing partitioning with column lengths (I haven't tested this or seen anything to suggest it's right or wrong, though I believe it to be wrong) – Glandule
The maximum size of a VARCHAR2 may impact memory allocation in PL/SQL. PL/SQL will allocate the maximum declared size for variables declared smaller than 4000; from 4000 upwards it allocates only the actual size. Which means that 4000 can be a very useful size, even if it does raise a lot of eyebrows. If you're only interested in SQL, then the size won't matter, because the minimum unit of database I/O is a block. – Headgear
@jonearles I'm interested in Oracle in particular. Could you please expand your answer? – Ripping

There is no difference in performance, and there are no hidden optimizations done for power-of-two lengths.

The only thing that does make a difference in how things are stored is the actual data. 100 characters stored in a VARCHAR2(2000) column are stored exactly the same way as 100 characters stored in a VARCHAR2(500) column.
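A quick way to check this yourself (not part of the original answer; the table names below are made up for illustration) is to store the same 100-character value in columns with different declared lengths and compare the stored byte counts with VSIZE:

create table demo500  (c varchar2(500));
create table demo2000 (c varchar2(2000));

insert into demo500  values (rpad('X', 100, 'X'));
insert into demo2000 values (rpad('X', 100, 'X'));

-- VSIZE reports the number of bytes Oracle actually stores for the value;
-- both queries return 100, regardless of the declared maximum length.
select vsize(c) from demo500;
select vsize(c) from demo2000;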

Think of the length as a business constraint, not as part of the data type. The only thing that should drive your decision about the length is the business constraint on the data that is put in there.

Edit: the only situation where the length does make a difference is when you need an index on that column. Older Oracle versions (< 10) had a limit on the key length, which was checked when creating the index.

Even though it's possible in Oracle 11, it might not be the wisest choice to have an index on a value with 4000 characters.
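To illustrate (a sketch assuming the default 8K block size, in the spirit of the ORA-01450 example given in the comments below): a single-column index on a VARCHAR2(4000) can still be created, but a composite index over two such columns exceeds the maximum key length.

create table test1 (a varchar2(4000), b varchar2(4000));

create index test1_a_idx  on test1(a);     -- fits within the key length limit
create index test1_ab_idx on test1(a, b);  -- ORA-01450: maximum key length (6398) exceeded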

Edit 2:

So I was curious and set up a simple test:

create table narrow (id varchar(40));
create table wide (id varchar(4000));

Then I filled both tables with strings composed of 40 'X' characters. If there was indeed a (substantial) difference in storage, this should show up somehow when retrieving the data, right?

Both tables have exactly 1048576 rows.
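The fill step isn't shown here; one way it could be done (a sketch, not necessarily how the rows were actually generated) is a CONNECT BY row generator:

insert into narrow (id)
select rpad('X', 40, 'X') from dual connect by level <= 1048576;

insert into wide (id)
select rpad('X', 40, 'X') from dual connect by level <= 1048576;

commit;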

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set autotrace traceonly statistics
SQL> select count(*) from wide;


Statistics
----------------------------------------------------------
          0  recursive calls
          1  db block gets
       6833  consistent gets
          0  physical reads
          0  redo size
        349  bytes sent via SQL*Net to client
        472  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> select count(*) from narrow;


Statistics
----------------------------------------------------------
          0  recursive calls
          1  db block gets
       6833  consistent gets
          0  physical reads
          0  redo size
        349  bytes sent via SQL*Net to client
        472  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>

So the full table scan behaves exactly the same for both tables. What happens when we actually select the data?

SQL> select * from wide;

1048576 rows selected.


Statistics
----------------------------------------------------------
          4  recursive calls
          2  db block gets
      76497  consistent gets
          0  physical reads
          0  redo size
   54386472  bytes sent via SQL*Net to client
     769427  bytes received via SQL*Net from client
      69907  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1048576  rows processed

SQL> select * from narrow;

1048576 rows selected.


Statistics
----------------------------------------------------------
          4  recursive calls
          2  db block gets
      76485  consistent gets
          0  physical reads
          0  redo size
   54386472  bytes sent via SQL*Net to client
     769427  bytes received via SQL*Net from client
      69907  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1048576  rows processed

SQL>

There is a slight difference in consistent gets, but that could be due to caching.
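Another way to compare the two tables directly (not part of the original test, just a suggestion of where else to look) is to gather statistics and compare segment sizes and average row lengths; if the declared length influenced storage, these numbers would differ:

exec dbms_stats.gather_table_stats(user, 'NARROW')
exec dbms_stats.gather_table_stats(user, 'WIDE')

-- num_rows, blocks and avg_row_len come from the gathered statistics
select table_name, num_rows, blocks, avg_row_len
from   user_tables
where  table_name in ('NARROW', 'WIDE');

-- user_segments shows the space actually allocated to each table
select segment_name, blocks, bytes
from   user_segments
where  segment_name in ('NARROW', 'WIDE');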

Answered by Biracial on 17/1, 2013 at 8:58. Comments (17):
OK. So you are claiming that the declared column length has no influence on how Oracle distributes database files on disk and the database cache in memory, so the power-of-two factor makes no sense at all? – Ripping
Only the actual length of the data influences this, not the declared maximum length. (And of course other configuration properties such as PCTFREE) – Biracial
This is related to something that has always intrigued me (although, it seems, not enough to actually look it up). So are you saying there really is no difference in performance between a varchar(2000) and a varchar(500)? It would settle a regular point of (good-natured) disagreement I have with the woman who runs the Oracle DB where I work... – Fugal
@a_horse_with_no_name Doesn't Oracle align data structures in memory on some multi-byte boundary? Do you think it just puts everything in contiguous sequences of bytes without any holes? – Ripping
@a_horse_with_no_name I'm not a specialist in this area, so I can't evaluate your answer until you provide some technical links or more detailed information. In any case, if your answer ends up the most popular and best received, of course I will accept it. Thanks a lot. – Ripping
@Andremoniy: Sorry, but why would you doubt a_horse_with_no_name and not the people who tell you the opposite (that size has some influence)? I mean, going by Stack Overflow reputation, this person is highly knowledgeable. And you could just do a test yourself, no? Create 2 big tables, one with varchar2(270) and another with varchar2(256), and try your joins, filters and everything. – Ipoh
@kevtufc: yes, that's what I'm saying. There is one thing to keep in mind though: indexing. I have edited my answer. – Biracial
@Andremoniy: not sure about alignment, you might have a point there. – Biracial
@Plouf! I'm not minimizing the authority of a_horse_with_no_name. I just want to wait for other opinions and evaluations. Furthermore, I'd like a theoretical substantiation for this question. Otherwise, if somebody asks me "why do you think so?", an answer like "a_horse_with_no_name said so" won't make any sense :) – Ripping
@Plouf!, just because someone has high reputation does not necessarily mean that they know what they're talking about. It means they have a lot of experience on Stack Overflow; although in this case a_horse_with_no_name is one of the top all-time answerers in Oracle, you should still not just assume that the answer is correct based on reputation alone. – Glandule
@Plouf! I totally agree with Ben and Andremoniy ;) After all, I could be wrong. I have never come across any statement showing that there is a difference in storage, but I have seen a lot that claim there is no difference. – Biracial
I definitely agree. That's why I proposed setting up a simple test case. You can't just believe someone on the internet. But asking a question on the internet, and asking some good fellow to prove their point, while letting your officemate just state things out of nowhere, is... a bit unfair. And I went along with it because I already tried this (on 11g) some months ago and knew it doesn't make any difference. – Ipoh
@a_horse_with_no_name Excellent! And what about comparing the case where the narrow table's column length is 127 and the wide one's is 128? Furthermore, wouldn't it be more interesting to collect statistics for a select * from ... query (or am I wrong)? – Ripping
@Andremoniy: the second block of statistics is for a select * from .., which did transfer all the rows from the server to the client. – Biracial
@a_horse_with_no_name Oops, sorry, I didn't notice. But what about 127/128? Could you please do the same test? – Ripping
The key length can still throw errors in 11g, but you'll need to use a multi-column index to hit the limit. For example: create table test1(a varchar2(4000), b varchar2(4000)); and create index test1_idx on test1(a, b); throws ORA-01450: maximum key length (6398) exceeded. – Headgear
The only other case I'm aware of where the size of a VARCHAR2 makes any difference is when using it in PL/SQL: the amount of memory allocated to variables based on the columns varies depending on whether they are greater or less than a tipping point (it used to be 2000; now it's 4000, I believe). – Smarm
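A minimal PL/SQL sketch of what that tipping point means in practice (hypothetical variable names; the exact threshold depends on the Oracle version, as noted above):

declare
  -- declared below the tipping point: PL/SQL allocates the full declared size up front
  small_value varchar2(100);
  -- declared at or above the tipping point (4000 in recent versions): memory is
  -- allocated according to the actual value length, not the declared maximum
  large_value varchar2(4000);
begin
  small_value := 'abc';
  large_value := 'abc';
end;
/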
