I read this post last night, and I noticed it was from 2006. I could go either way on the ORM, database thing, but I was just wondering if everything bad Jeff said about ORM still applies even now considering the post is from 2006.
It's still true.
Even more than OO software, the database suffers if it isn't treated precisely the way intended. And it wasn't intended that you should interpose some abstraction layer in front of it.
I think of impermeable abstraction layers as trying to build a Lego castle with all the pieces closed up into a pillowcase. SQL is damn hard to do correctly. It doesn't share many patterns with procedural programming, and best practices for one can be the opposite for the other. You need to be able to grok every single item in a SQL statement, and have a pretty good idea what it's intended to do, and what it in fact does.
Lots of people seem to think that, like horseshoes, close is good enough - if the right answer pops out, that implies you're nearly there. In SQL, that's simply not true.
RoR and the ActiveRecord pattern have deservedly earned a reputation as dbms resource hogs for this reason. Optimized ActiveRecord design is more often than not suboptimal SQL design, because it encourages SQL statement decomposition.
Yes.
Object-oriented is still object-oriented and Relational is still Set-oriented. Nothing has changed in these two paradigms in the past two years to make them work better together.
In many people's eyes, SQL is ugly, complex, and confusing. But trying to make an object-oriented interface to perform the same functionality is always uglier, more complex, and has a steeper learning curve.
In all programming, there is a tradeoff between flexibility and assumptions. Frameworks (such as Rails) try to solve the problem by being "opinionated." That is, they limit flexibility of either the relational or the object-oriented aspects of the problem, making assumptions about how data is structured, and what operations you can do with it. Naturally, simplifying the problem space makes the solution simpler as well.
In addition, it's frustrating to discover that an ORM framework is incomplete, so some ordinary operations in SQL have no solution in a given ORM. This is also a consequence of "opinionated" frameworks.
A lot of web 2.0 companies are working on key-value stores. And all these companies have to go through the same painfull process of making it work.
If ORM is the "vietnam of computer science" then building your own key-value store is probably the "Iraq of computer science" :-)
I can only speak for my experience. I use DAL and DTO's, and I am still capable of performing pretty complex queries (joins and all), and also I am able to resort to SP's or custom SQL whenever I need. It made my life easier, my code consistent and my deadlines more attainable.
I think starting from the assumption that Jeff's conclusions are correct is not necessarily good; having maintained stored procedure code as well as JDBC-based data layers, I can say that these caused maintenance problems galore, mostly related to the inability to understand what was going on at a higher level.
A database is necessarily low-level; it stores numbers and strings essentially. Business logic is high-level. This is why we have abstraction.
Personally, I think the Rails/ActiveRecord way is the best solution to having an object/domain model but also being able to take advantage of a relational database.
So: don't throw out ORM, but don't default to it either. It's a tool that solves certain problems. To ignore it would be ignorant and to always use it would be arrogant.
Jeffs article links through to Ted Newards article. If you are into the details then you need to look there:
original - http://blogs.tedneward.com/post/the-vietnam-of-computer-science/
followup - http://blogs.tedneward.com/post/thoughts-on-vietnam-commentary/
Of Ted's original points I have it as:
- 1 was wrong (Identity)
- 2 of them solved (Partial objects and N + 1)
- 2 are debatable (Dual schema / shared schema).
Disclaimer: I'm the author of Ebean ORM so I'll reference that for the various 'solutions' to the issues raised.
Ted's original points (distilled because it's really wordy):
1. Partial Object problem.
Always solvable. Ebean ORM made partial objects fundemental to it's query language and all internals. JPQL didn't make this a priority to it's more a problem there unfortunately.
2. N + 1 (Ted's Load time paradox)
Always solvable. Should have been written as 1 + N / batchSize
but it is more interesting than that (per path, need to take into account SQL paging, avoid sql cartesian product). Some ORM's make a right mess of this unfortunately and that brings ORM in general into disrepute. Some ORM's are ok until you hit a level of complexity (like OneToMany inside OneToMany inside OneToMany).
Just to up the ante here ORM's can profile the object graph use and automatically optimise the query (only fetching what is needed, defining fetch paths to optimise for N + 1 etc).
This automatic ORM query optimisation idea came out of the University of Texas (using Hibernate). It was included as part of Ebean ORM in 2008 so it's been around for a while now.
3. Identity
Ted cracks on about a mismatch on identity and concurrency. This point is misplaced as ORMs (well, all the ones I know) go about this aspect in exactly the same manor as the prior client/server tools and specifically ORM's are providing a SNAPSHOT view of part of the database to the application. There was never a problem here but ORM implementations could get themselves into strife with an over reliance on hashCode()/equals() for example.
4. Dual schema problem
This is debatable. If the organisation allows then the ORM can provide a DIFF/SQL script to the schema and that is run by FlywayDB/Liquibase etc. If organisations don't allow that this might still be an issue to some extent.
5. DB Refactoring / Shared schema
This is debatable. DB design/normalisation folks would argue that the DB design should get to 4NF and that means any refactoring should solely be additive (denormalisation, adding columns/tables) and not breaking changes. People who don't believe in normalisation will be going nuts worried about shared schema.
I think it does.
I think the last sentence is the most interesting of all: "I tend to err on the side of the database-as-model camp, because I think objects are overrated." Java, C++, and C# are certainly the dominant languages, but functional programming is making a comeback with F#, Scala, etc.
This isn't my area of expertise, but I worked in Rails for about a year and I think ActiveRecord solved most of the DB Mapping problem. I realize it has a few issues, but I think it did a fantastic job.
I don't think his post took into account the possibility of the framework itself (in this case AcitveRecord/Rails) defining the database AND the Object Model, which--as far as I can tell--makes the problem go away.
Since this is the opposite of the first two responses (basically that the post is outdated) I feel like I'm probably not understanding something; If that's the case please correct me instead of just voting me down because I think there is an important point I'm missing.
IMHO, Monadic approaches like Scala's Slick and Quill largely bypass the aforementioned quagmire, offering more robust solutions to many of Ted's problems (also JOOQ deserves to be mentioned). While they're not perfect, they definitely kill the N+1 and Partial Object problems, mostly kill the Identity problem, and partially kill the Dual Schema problem. Forget about clumsy and verbose CriteriaBuilder queries (anyone read "Execution in the Kingdom of Nouns???"), Scala's monadic for-comprehensions give you a simple DSL in which to write criteria-queries:
case class Person(id: Int, name: String, age: Int)
case class Contact(personId: Int, phone: String)
val query = for {
p <- query[Person] if(p.id == 999)
c <- query[Contact] if(c.personId == p.id)
} yield {
(p.name, c.phone)
}
This kind of syntax stays sane, even for more complex queries e.g. Ted's is-acceptable-spouse query:
case class Person(name:String, spouse:Option[PersonId]) {
isThisAnAccpetableSpouse(person:Person) {...}
}
val query = for {
p1 <- people
p2 <- people if (
p1.spouse.isEmpty && p2.spouse.isEmpty
&& p1.isThisAnAccpetableSpouse(p2)
&& p1.isThisAnAccpetableSpouse(p1))
} yield (p1, p2)
All of this get's compiled into a single SELECT query and run against the database. Inserts, Updates, and Deletes are similar.
Disclosure: I'm the author of RDO.Net.
Yes. I believe existing ORMs are going in the wrong direction - they're trying to map relational data into arbitrary objects which IMHO is simply not possible. In the OO world, the arbitrary object is not serialization/deserialization friendly, because every object has an object reference (the address) that is local to the current process. It's OK when you're dealing with a single object, you will get into trouble when you're facing a complex object graph.
Hopefully this problem can be solved using a different approach, as in RDO.Net: instead of mapping relational data into arbitrary objects, we map the schema of the data as rich metadata (Model
that contains columns, primary/foreign keys, validations, etc.), and expose concrete Db
, DbTable
, DbQuery
and DataSet
for data access. When performing database CRUD operations, you construct the SQL Abstract Syntax Tree (AST) explicitly using strongly typed OO language, as demonstrated in the following sample C# code, which creates a query and returns a DataSet for hierarchical data:
public async Task<DataSet<SalesOrderInfo>> GetSalesOrderInfoAsync(_Int32 salesOrderID, CancellationToken ct = default(CancellationToken))
{
var result = CreateQuery((DbQueryBuilder builder, SalesOrderInfo _) =>
{
builder.From(SalesOrderHeader, out var o)
.LeftJoin(Customer, o.FK_Customer, out var c)
.LeftJoin(Address, o.FK_ShipToAddress, out var shipTo)
.LeftJoin(Address, o.FK_BillToAddress, out var billTo)
.AutoSelect()
.AutoSelect(c, _.Customer)
.AutoSelect(shipTo, _.ShipToAddress)
.AutoSelect(billTo, _.BillToAddress)
.Where(o.SalesOrderID == salesOrderID);
});
await result.CreateChildAsync(_ => _.SalesOrderDetails, (DbQueryBuilder builder, SalesOrderInfoDetail _) =>
{
builder.From(SalesOrderDetail, out var d)
.LeftJoin(Product, d.FK_Product, out var p)
.AutoSelect()
.AutoSelect(p, _.Product)
.OrderBy(d.SalesOrderDetailID);
}, ct);
return await result.ToDataSetAsync(ct);
}
© 2022 - 2024 — McMap. All rights reserved.