Why are positional queries bad?

G

9

7

I'm reading CJ Date's SQL and Relational Theory: How to Write Accurate SQL Code, and he makes the case that positional queries are bad — for example, this INSERT:

INSERT INTO t VALUES (1, 2, 3)

Instead, you should use attribute-based queries like this:

INSERT INTO t (one, two, three) VALUES (1, 2, 3)

Now, I understand that the first query is out of line with the relational model since tuples (rows) are unordered sets of attributes (columns). I'm having trouble understanding where the harm is in the first query. Can someone explain this to me?

Governorship answered 4/7, 2009 at 20:57 Comment(0)

O

20

The first query breaks pretty much any time the table schema changes. The second query accomodates any schema change that leaves its columns intact and doesn't add defaultless columns.

People who do SELECT * queries and then rely on positional notation for extracting the values they're concerned about are software maintenance supervillains for the same reason.

Owner answered 4/7, 2009 at 21:1 Comment(1)

Also, significantly, positional queries are not congruent with the relational model. A true relation's attributes are orderless and positional queries depend on the table's columns having an order. Thus, if you're performing a positional query, you're not performing it on a true relation. – Vyborg 2/11, 2010 at 18:2

C

9

While the order of columns is defined in the schema, it should generally not be regarded as important because it's not conceptually important.

Also, it means that anyone reading the first version has to consult the schema to find out what the values are meant to mean. Admittedly this is just like using positional arguments in most programming languages, but somehow SQL feels slightly different in this respect - I'd certainly understand the second version much more easily (assuming the column names are sensible).

Cercaria answered 4/7, 2009 at 21:3 Comment(0)

B

5

I don't really care about theoretical concepts in this regard (as in practice, a table does have a defined column order). The primary reason I would prefer the second one to the first is an added layer of abstraction. You can modify columns in a table without screwing up your queries.

Baud answered 4/7, 2009 at 21:1 Comment(2)

However, somebody who is unaware that the first query exist could add a query in the middle of the first two, and mess up the query. – Icelandic 4/7, 2009 at 22:18

@Kibbee: ... and that's why it shouldn't exist. – Baud 4/7, 2009 at 22:25

P

2

You should try to make your SQL queries depend on the exact layout of the table as little as possible.

The first query relies on the table only having three fields, and in that exact order. Any change at all to the table will break the query.

The second query only relies on there being those three felds in the table, and the order of the fields is irrelevant. You can change the order of fields in the table without breaking the query, and you can even add fields as long as they allow null values or has a default value.

Although you don't rearrange the table layout very often, adding more fields to a table is quite common.

Also, the second query is more readable. You can tell from the query itself what the values put in the record means.

Pameliapamelina answered 4/7, 2009 at 21:16 Comment(0)

M

2

Something that hasn't been mentioned yet is that you will often be having a surrogate key as your PK, with auto_increment (or something similar) to assign a value. With the first one, you'd have to specify something there — but what value can you specify if it isn't to be used? NULL might be an option, but that doesn't really fit in considering the PK would be set to NOT NULL.

But apart from that, the whole "locked to a specific schema" is a much more important reason, IMO.

Mariken answered 4/7, 2009 at 22:21 Comment(0)

A

1

SQL gives you syntax for specifying the name of the column for both INSERT and SELECT statements. You should use this because:

Your queries are stable to changes in the column ordering, so that maintenance takes less work.
The column ordering maps better to how people think, so it's more readable. It's more clear to think of a column as the "Name" column rather than the 2nd column.

Alexandria answered 4/7, 2009 at 21:28 Comment(0)

B

1

I prefer to use the UPDATE-like syntax:

INSERT t SET one = 1 , two = 2 , three = 3

Which is far easier to read and maintain than both the examples.

Bainbridge answered 4/7, 2009 at 22:15 Comment(3)

I don't believe this is cross-platform - at least, T-SQL doesn't seem to support it. – Mariken 4/7, 2009 at 22:18

Another reason not to use T-SQL. ;) Looks like this is primarily a MySQL extension. It's a pity other DBMSs haven't added support for it. – Bainbridge 4/7, 2009 at 23:2

Language evolution is fairly slow, especially for languages that have been around for a while like SQL. It's too bad, I like your example. – Alexandria 4/7, 2009 at 23:15

C

1

Long term, if you add one more column to your table, your INSERT will not work unless you explicitly specify list of columns. If someone changes the order of columns, your INSERT may silently succeed inserting values into wrong columns.

Carabiniere answered 4/7, 2009 at 22:20 Comment(0)

C

0

I'm going to add one more thing, the second query is less prone to error orginally even before tables are changed. Why do I say that? Becasue with the seocnd form you can (and should when you write the query) visually check to see if the columns in the insert table and the data in the values clause or select clause are in fact in the right order to begin with. Otherwise you may end up putting the Social Security Number in the Honoraria field by accident and paying speakers their SSN instead of the amount they should make for a speech (example not chosen at random, except we did catch it before it actually happened thanks to that visual check!).

Cradling answered 22/12, 2010 at 15:31 Comment(0)

Recommended topics

Hot tags