Should I be concerned that ORMs, by default, return all columns?
Asked Answered
D

7

6

In my limited experience in working with ORMs (so far LLBL Gen Pro and Entity Framework 4), I've noticed that inherently, queries return data for all columns. I know NHibernate is another popular ORM, and I'm not sure that this applies with it or not, but I would assume it does.

Of course, I know there are workarounds:

  • Create a SQL view and create models and mappings on the view
  • Use a stored procedure and create models and mappings on the result set returned

I know that adhering to certain practices can help mitigate this:

  • Ensuring your row counts are reasonably limited when selecting data
  • Ensuring your tables aren't excessively wide (large number of columns and/or large data types)

So here are my questions:

  1. Are the above practices sufficient, or should I still consider finding ways to limit the number of columns returned?

  2. Are there other ways to limit returned columns other than the ones I listed above?

  3. How do you typically approach this in your projects?

Thanks in advance.

UPDATE: This sort of stems from the notion that SELECT * is thought of as a bad practice. See this discussion.

Designedly answered 3/3, 2011 at 5:16 Comment(4)
Is there a performance problem? Some frameworks offer finer-grained control, but for most records I suspect the queries are more "costly" (for normal cases) than extracting/returning the data itself -- of course, with a large data field... things start to change.Chucho
See my update above. I've always heard of SELECT * as a bad practice. Of course, I know this is a bit different, as part of this is because it returns all rows even when your table changes -- and ORMs won't suffer from this problem. But the other argument is bandwidth consumption, which does apply to ORMs.Designedly
LLBLGen allows you to remove columns from your entity. You can also have different versions of the same entity (so a full version and a "lean" version for example).Maenad
Also, LLBLGen allows you to utilize an IncludeFields list or ExcludeFields list. However, you still get back all of the keys (which makes sense).Farber
S
9

One of the reasons to use an ORM of nearly any kind is to delay a lot of those lower-level concerns and focus on the business logic. As long as you keep your joins reasonable and your table widths sane, ORMs are designed to make it easy to get data in and out, and that requires having the entire row available.

Personally, I consider issues like this premature optimization until encountering a specific case that bogs down because of table width.

Stadia answered 3/3, 2011 at 5:22 Comment(0)
C
2

First of : great question, and about time someone asked this! :-)

Yes, the fact an ORM typically returns all columns for a database table is something you need to take into consideration when designing your systems. But as you've mentioned - there are ways around this.

The main fact for me is to be aware that this is what happens - either a SELECT * FROM dbo.YourTable, or (better) a SELECT (list of all columns) FROM dbo.YourTable.

This is not a problem when you really want the whole object and all its properties, and as long as you load a few rows, that's fine, too - the convenience beats the raw performance.

You might need to think about changing your database structures a little bit - things like:

  • maybe put large columns like BLOBs into separate tables with a 1:1 link to your base table - that way, a select on the parent tables doesn't grab all those large blobs of data

  • maybe put groups of columns that are optional, that might only show up in certain situations, into separate tables and link them - again, just to keep the base tables lean'n'mean

Also: avoid trying to "arm-wrestle" your ORM into doing bulk operations - that's just not their strong point.

And: keep an eye on performance, and try to pick an ORM that allows you to change certain operations into e.g. stored procedures - Entity Framework 4 allows this. So if the deletes are killing you - maybe you just write a Delete stored proc for that table and handle that operation differently.

Clipfed answered 3/3, 2011 at 5:51 Comment(0)
D
1

The question here covers your options fairly well. Basically you're limited to hand-crafting the HQL/SQL. It's something you want to do if you run into scalability problems, but if you do in my experience it can have a very large positive impact. In particular, it saves a lot of disk and network IO, so your scalability can take a big jump. Not something to do right away though: analyse then optimise.

Defazio answered 3/3, 2011 at 5:27 Comment(0)
C
1

Are there other ways to limit returned columns other than the ones I listed above?

NHibernate lets you add projections to your queries so you wouldn't need to use views or procs just to limit your columns.

Cerebrovascular answered 3/3, 2011 at 6:31 Comment(0)
W
0

For me this has only been an issue if the tables has LOTS of columns > 30 or if the column had alot of data for example a over 5000 character in a field.

The approach I have used is to just map another object to the existing table but with only the fields I need. So for a search that populates a table with 100 rows I would have a MyObjectLite, but when I click to view the Details of that Row I would call a GetById and return a MyObject that has all the columns.

Another approach is to use custom SQL, Stroed procs but I only think you should go down this path if you REALLY need the performance gain and have users complaining. SO unless there is a performance problem do not waste your time trying to fix a problem that does not exist.

Whorehouse answered 3/3, 2011 at 5:24 Comment(0)
I
0

You can limit number of returned columns by using Projection and Transformers.AliasToBean and DTO here how it looks in Criteria API:

.SetProjection(Projections.ProjectionList()
    .Add(Projections.Property("Id"), "Id")
    .Add(Projections.Property("PackageName"), "Caption"))
.SetResultTransformer(Transformers.AliasToBean(typeof(PackageNameDTO)));
Insomnia answered 3/3, 2011 at 14:8 Comment(0)
B
0

In LLBLGen Pro, you can return Typed Lists which not only allow you to define which fields are returned but also allow you to join data so you can pull a custom list of fields from multiple tables.

Overall, I agree that for most situations, this is premature optimization.

One of the big advantages of using LLBLGen and other ORMs as well (I just feel confident speaking about LLBLGen because I have used it since its inception) is that the performance of the data access has been optimized by folks who understand the issues better than your average bear.

Whenever they figure out a way to further speed up their code, you get those changes "for free" just by re-generating your data layer or by installing a new dll.

Unless you consider yourself an expert at writing data access code, ORMs probably improve most developers efficacy and accuracy.

Beleaguer answered 8/9, 2011 at 2:6 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.