Why do subqueries with nested properties in EclipseLink always produce an unnecessary/redundant/duplicate/superfluous join?
Asked Answered
M

0

2

I wanted to generate the following SQL through EclipseLink (2.6.1) JPA criteria.

SELECT
    zonecharge0_.charge AS col_0_0_ 
FROM
    projectdb.zone_charge zonecharge0_ 
WHERE
    (
        EXISTS (
            SELECT
                country1_.country_id 
            FROM
                projectdb.country country1_ 
            WHERE
                country1_.country_id=? 
                and zonecharge0_.zone_id=country1_.zone_id
        )
    ) 
    AND (
        zonecharge0_.weight_id IN (
            SELECT
                MAX(weight2_.weight_id) 
            FROM
                projectdb.weight weight2_ 
            WHERE
                weight2_.weight BETWEEN 0 AND 60
        )
    )

Running both of the following criteria and JPQL queries on Hibernate (4.3.7 final) produce the above clean query that one can always intuitively expect.

CriteriaBuilder criteriaBuilder=entityManager.getCriteriaBuilder();
CriteriaQuery<BigDecimal>criteriaQuery=criteriaBuilder.createQuery(BigDecimal.class);
Metamodel metamodel = entityManager.getMetamodel();
Root<ZoneCharge> root = criteriaQuery.from(metamodel.entity(ZoneCharge.class));
criteriaQuery.select(root.get(ZoneCharge_.charge));

Subquery<Long> countrySubquery = criteriaQuery.subquery(Long.class);
Root<Country> countryRoot = countrySubquery.from(metamodel.entity(Country.class));
countrySubquery.select(countryRoot.get(Country_.countryId));

Predicate[] predicates=new Predicate[2];
predicates[0]=criteriaBuilder.equal(countryRoot, country);
predicates[1]=criteriaBuilder.equal(root.get(ZoneCharge_.zoneId), countryRoot.get(Country_.zoneId));
countrySubquery.where(predicates);

Subquery<Long> weightSubquery = criteriaQuery.subquery(Long.class);
Root<Weight> weightRoot = weightSubquery.from(metamodel.entity(Weight.class));
weightSubquery.select(criteriaBuilder.max(weightRoot.get(Weight_.weightId)));
weightSubquery.where(criteriaBuilder.between(weightRoot.get(Weight_.weight), BigDecimal.ZERO, weight));

predicates=new Predicate[2];
predicates[0]=criteriaBuilder.exists(countrySubquery);
predicates[1]=criteriaBuilder.in(root.get(ZoneCharge_.weightId).get(Weight_.weightId)).value(weightSubquery);

criteriaQuery.where(predicates);
BigDecimal charge = entityManager.createQuery(criteriaQuery).getSingleResult();

The corresponding JPQL :

SELECT zc.charge 
FROM   ZoneCharge AS zc 
WHERE  EXISTS(SELECT c.countryId 
              FROM   Country AS c 
              WHERE  zc.zoneId = c.zoneId 
                     AND c.countryId = 1) 
       AND zc.weightId.weightId IN(SELECT MAX(w.weightId) 
                                   FROM   Weight AS w 
                                   WHERE  w.weight BETWEEN 0 AND 60) 

Running these queries (both criteria and JPQL) without any modifications on EclipseLink (2.6.1) produce the following very ugly/cluttered query.

SELECT t0.charge 
FROM   projectdb.zone_charge t0, 
       projectdb.weight t1   /*Unnecessary table listing.*/
WHERE  ((EXISTS (SELECT t2.country_id 
                   FROM   projectdb.zone_table t3, /*Unnecessary table listing.*/
                          projectdb.country t2 
                   WHERE  ((( t2.country_id = ?) 
                              AND( t0.zone_id = t3.zone_id)) 
                            AND(t3.zone_id = t2.zone_id)))  /*Duplicate join.*/
           AND t1.weight_id IN (SELECT MAX(t4.weight_id) 
                                FROM   projectdb.weight t4 
                                WHERE  (t4.weight BETWEEN ? AND ?))) 
         AND (t1.weight_id = t0.weight_id))  /*Duplicate join.*/

I just want to fetch a scalar value of type BigDecimal. It doesn't require a join here nor I mentioned one though it produces two superfluous joins which are completely unnecessary.

The cause of these redundant joins are two nested properties in the predicates given.

root.get(ZoneCharge_.weightId).get(Weight_.weightId)

and

root.get(ZoneCharge_.zoneId), countryRoot.get(Country_.zoneId)

Look at the Predicate array.

I have been trying to know the reason for a long time but I don't find anything mentioned anywhere about this phenomenon in EclipseLink.

I don't think this could be an oversight. It looks a lot deliberate.

I'm very much disappointed to see this kind of statements. Is there any reason to get EclipseLink to generate this kind of redundant joins? Are there some settings or something else to get rid of them that I might be missing in EclipseLink? Can they be avoided anyway?


Additional Information :

The table structures are simple. Three tables are involved here namely.

  • zone_charge
  • country
  • weight

The zone_charge table has a column weight_id which is a foreign key referencing the corresponding primary key in the weight table.

The country and zone_charge tables both have a zone_id column which a foreign key of the zone_id column in zone_table (which is not mentioned here) in both the tables.

The query returns a charge value in zone_charge in a specific zone based on the value of weight. A user is assumed to be unaware of zones. Therefore, a zone is selected in the background by this query based on a country supplied by a user by selecting a country in a drop-box.


Update :

Some of the issues with different views and opinions:

All of them imply the same thing. Hibernate tends to be very clever in generating tidy SQL statements. I do not think that those who use EcliplseLink (which is in turn, one of the widely used ORM frameworks) like having this kind of untidy SQL statements been executed upon the database.

Do you really just keep going on with them?

Molehill answered 19/7, 2014 at 6:9 Comment(4)
Does someone know whether this will ever be taken into account in any subsequent future release of EclipseLink?Molehill
I think this question should go to the Eclipselink forums or supportProcurable
@AhmadHosny : It is already there.Molehill
Sorry @Tiny, I haven't seen that before. I wish I could help.Procurable

© 2022 - 2024 — McMap. All rights reserved.