SQL query to select a column with expression of non-aggregate value and aggregate function
Asked Answered
D

3

6

Tables used:

1) v(date d, name c(25), desc c(50), debit n(7), credit n(7))

name in 'v' refers name in vn table

2) vn(date d, name c(25), type c(25), obal n(7))

name in 'vn' is a primary key and different names are grouped by type

ex: names abc, def, ghi belongs to type 'bank', names xyz, pqr belongs to type 'ledger', ...

I've a query like this:

SELECT vn.type, SUM(vn.obal + IIF(v.date < sd, v.credit-v.debit, 0)) OpBal, ;
    SUM(IIF(BETWEEN(v.date, sd, ed), v.credit-v.debit, 0)) CurBal ;
    FROM v, vn WHERE v.name = vn.name GROUP BY vn.type ;
    ORDER BY vn.type HAVING OpBal + CurBal != 0

It works fine but the only problem is, obal is a value which is entered only once per name in table 'vn' but with this query for every calculation of credit-debit in table 'v', obal is added multiple times and displayed under OpBal. When the query is modified like below:

SELECT vn.type, vn.obal + SUM(IIF(v.date < sd, v.credit-v.debit, 0)) OpBal, ;
    SUM(IIF(BETWEEN(v.date, sd, ed), v.credit-v.debit, 0)) CurBal ;
    FROM v, vn WHERE v.name = vn.name GROUP BY vn.type ;
    ORDER BY vn.type HAVING OpBal + CurBal != 0

it shows an error message like 'Group by clause is missing or invalid'!

RDBMS used MS Visual Foxpro 9. sd and ed are date type variables used for the purpose of query where sd < ed.

Please help me out getting the expected result. Thanks a lot.

Duomo answered 7/7, 2012 at 6:43 Comment(11)
You need to group by in the vn table and then join that result to the query you have.Abramson
@Dems: I was thinking what would be needed for a GROUP BY vn.type query. With a GROUP BY type, obal, I think your answer is fine. (By the way, is this HAVING after ORDER BY valid syntax in Foxpro? It makes me wonder)Abramson
And I assumed that the Primary Key of vn is name, by the description.Abramson
@ypercube - My bad,the PK is name, and not type. I think I just saw name where I should have seen type.Craw
@ypercube: I haven't been working with FoxPro for ages now, and I haven't ever worked in either FoxPro for Windows or Visual FoxPro, but the DOS versions had a certain degree of liberality towards the order of clauses in a SELECT statement, and maybe subsequent versions of the product retained that specimen of indulgence.Impedance
@Andriy: Backwadrs compatibility is a killer in DBMS products.Abramson
What are CurCr and CurDb? You are referencing them in the HAVING clause but they are nowhere else to be seen. Where do they come from? Are they columns or variables?Impedance
@Dems: I think you were either spot on or at least on the right track with your answer. Why did you delete it?Impedance
@AndriyM - On phone now, so not patient enough to edit. Answer probably needs two levels of ahgregation, and that requires more info. Don't have time to quiz the op about it now.Craw
@ypercube you are right, pk(vn) is name and yes VFP 9.0 does support HAVING clause!Duomo
@andriy-m it was a copy-paste mistake, CurCr - CurDb shoud be CurBal. Question edited nowDuomo
R
1

Once again, this question was bumped to front :( Before I made a comment like:

It is unfortunate that this question from 6 years back has been bumped to front page :( The reason is that the question is starving for some explanations and the design is screaming of flaws. Without knowing those details no answer would be good. 1) Never ever rely on the old enginebehavior to workaround this error. It was a BUG and corrected. Relying on a bug is not the way to go. 2) To create a date value, never ever do that by converting from a string. That is settings dependent. Better either use solid date(), datetime() functions or strict date\datetime literals.

That comment stands and there are more, anyway.

Let's add more about the flaws and a reply based on what I understand from the question.

In the question OP says:"

SELECT vn.type, SUM(vn.obal + IIF(v.date < sd, v.credit-v.debit, 0)) OpBal, ;
    SUM(IIF(BETWEEN(v.date, sd, ed), v.credit-v.debit, 0)) CurBal ;
    FROM v, vn WHERE v.name = vn.name GROUP BY vn.type ;
    ORDER BY vn.type HAVING OpBal + CurBal != 0

works fine."

But of course, any seasoned developer using SQL, would immediately see that it would not work fine, the results could only be fine coincidentally. Let's make it clear why it wouldn't really work. First let's create some cursors describing the OP's data:

CREATE CURSOR vn ( date d, name c(25), type c(25), obal n(7) )

INSERT INTO vn (date, name, type, obal) VALUES ( DATE(2012,5,20), "abc", "bank", 100 )
INSERT INTO vn (date, name, type, obal) VALUES ( DATE(2012,5,20), "def", "bank", 200 )
INSERT INTO vn (date, name, type, obal) VALUES ( DATE(2012,5,20), "ghi", "bank", 300 )
INSERT INTO vn (date, name, type, obal) VALUES ( DATE(2012,5,20), "xyz", "ledger", 400 )
INSERT INTO vn (date, name, type, obal) VALUES ( DATE(2012,5,20), "pqr", "ledger", 500 )

CREATE CURSOR v ( date d, name c(25), desc c(50), debit n(7), credit n(7))

INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,1), "abc", "description 1", 50, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,2), "abc", "description 1", 60, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,3), "abc", "description 1", 70, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,1), "def", "description 1", 50, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,2), "def", "description 1", 60, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,3), "def", "description 1", 70, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,1), "ghi", "description 1", 50, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,2), "ghi", "description 1", 60, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,4), "xyz", "description 1", 50, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,5), "xyz", "description 1", 60, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,6), "pqr", "description 1", 50, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,7), "pqr", "description 1", 60, 0 )
INSERT INTO V (date,name,desc,debit,credit) VALUES ( DATE(2012,6,8), "pqr", "description 1", 70, 0 )

Let's run OP's query on this data:

LOCAL sd,ed
sd = date(2012,6,1)  && start date of transactions
ed = DATE()          && current date as the end date...

SELECT vn.type, SUM(vn.obal + IIF(v.date < sd, v.credit-v.debit, 0)) OpBal, ;
    SUM(IIF(BETWEEN(v.date, sd, ed), v.credit-v.debit, 0)) CurBal ;
    FROM v, vn WHERE v.name = vn.name GROUP BY vn.type ;
    ORDER BY vn.type HAVING OpBal + CurBal != 0

We get a result like:

TYPE    OPBAL   CURBAL
------- -----   ------
bank     1500     -470
ledger   2300     -290

which is obviously incorrect. With a query like this, the more you get credit or debit, the more you have an opening balance. Let's see why is that happening, removing the group by and aggregation, check what we are really summing up:

SELECT vn.type, vn.name, v.date, vn.obal, v.credit, v.debit ;
    FROM v, vn ;
    WHERE v.name = vn.name

Output:

  TYPE     NAME DATE         OBAL CREDIT DEBIT
  bank     abc  06/01/2012    100      0    50
  bank     abc  06/02/2012    100      0    60
  bank     abc  06/03/2012    100      0    70
  bank     def  06/01/2012    200      0    50
  bank     def  06/02/2012    200      0    60
  bank     def  06/03/2012    200      0    70
  bank     ghi  06/01/2012    300      0    50
  bank     ghi  06/02/2012    300      0    60
  ledger   xyz  06/04/2012    400      0    50
  ledger   xyz  06/05/2012    400      0    60
  ledger   pqr  06/06/2012    500      0    50
  ledger   pqr  06/07/2012    500      0    60
  ledger   pqr  06/08/2012    500      0    70

You can see that, say for 'abc' OBal 100 is repeated 3 times, because of the 3 entries in v. Summing would make it 300 when it is just 100.

When you are using an aggregate like SUM() or AVG(), you should do the aggregation first without a join, then do your join. You can still do the aggregation, with a join, PROVIDED, the join results in a 1-to-many relation. If the above resultset were:

  TYPE   NAME OBAL CREDIT DEBIT
  bank    abc  100      0   180
  bank    def  200      0   180
  bank    ghi  300      0   110
  ledger  xyz  400      0   110
  ledger  pqr  500      0   180

it would be OK to SUM() BY type (1 side of 1-to-Many).

Having said that and adding VFP supports subqueries let's write a solution:

Local sd,ed
sd = Date(2012,6,1)  && start date of transactions
ed = Date()          && current date as the end date...

Select vn.Type, Sum(vn.OBal - tmp.TotCd) As Opbal, Sum(tmp.Curbal) As Curbal ;
    FROM vn ;
    LEFT Join ;
       (Select v.Name, Sum(Iif(v.Date < sd, v.credit-v.debit, 0)) TotCd, ;
               SUM(Iif(Between(v.Date, sd, ed), v.credit-v.debit, 0)) Curbal ;
        FROM v ;
        GROUP By v.Name ) tmp On tmp.Name = vn.Name ;
    GROUP By vn.Type ;
    ORDER By vn.Type ;
    HAVING Sum(vn.OBal - tmp.TotCd + tmp.Curbal) != 0

we get what we want:

TYPE    OPBAL   CURBAL
------- -----   ------
bank      600     -470
ledger    900     -290
Recollected answered 4/9, 2019 at 11:20 Comment(0)
F
0

I saw the SQL Syntax for SQL with VFP for the first time a few minutes ago, so this could well be full of errors, but as a 'guessful hunch':

SELECT vn.type, 
       SUM(vn.obal + (SELECT SUM(IIF(v.date < sd, v.credit-v.debit, 0)) 
                      FROM v 
                      WHERE v.name = vn.name)) OpBal,
       SUM(SELECT SUM(IIF(BETWEEN(v.date, sd, ed), v.credit-v.debit, 0))
           FROM v 
           WHERE v.name = vn.name) CurBal
FROM vn
GROUP BY vn.type
ORDER BY vn.type 
HAVING OpBal + CurBal != 0

Basically, I've just turned selection from v into subselects to avoid vn.obal to be repeated. It shouldn't matter for v that it first gets the sum for the individual person before summing them all together.

Fronde answered 7/7, 2012 at 12:27 Comment(1)
Set, sorry! It produces an error message: 'Function name is missing ).'Duomo
R
0

just a few things. VFP 9 has a setting to NOT require grouping for all non-aggregate for backward compatibility and similar results like MySQL where not all columns must be aggregates. Such as querying extra columns from a customer's record that never change no matter how many records you are joining against on its PK column (name, address, phone, whatever).

SET ENGINEBEHAVIOR 80

default for VFP 9

SET ENGINEBEHAVIOR 90

requires all non group-by columns to be aggregates to comply.

Next... it looks like you have very bad columns in the tables you are dealing with... 3 reserved words in VFP... "date", "Name" and "type", however you are ok by qualifying them in the query with alias.column reference.

The following sample code will create temporary tables (cursors) of the structures you've described in your question. I've also inserted some sample data and simulated your "sd" (start date) and "ed" (end date) variables

CREATE CURSOR vn;
   ( date d, ;
     name c(25), ;
     type c(25), ;
     obal n(7) )

INSERT INTO vn VALUES ( CTOD( "5/20/2012" ), "person 1", "person type 1", 125 )
INSERT INTO vn VALUES ( CTOD( "5/20/2012" ), "person 2", "another type ", 2155 )

CREATE CURSOR v;
   ( date d, ;
     name c(25), ;
     desc c(50), ;
     debit n(7), ;
     credit n(7))

INSERT INTO V VALUES ( CTOD( "6/1/2012" ), "person 1", "description 1", 10, 32 )
INSERT INTO V VALUES ( CTOD( "6/2/2012" ), "person 1", "desc 2", 235, 123 )
INSERT INTO V VALUES ( CTOD( "6/3/2012" ), "person 1", "desc 3", 22, 4 )
INSERT INTO V VALUES ( CTOD( "6/4/2012" ), "person 1", "desc 4", 53, 36 )
INSERT INTO V VALUES ( CTOD( "6/5/2012" ), "person 1", "desc 5", 31, 3 )
INSERT INTO V VALUES ( CTOD( "6/1/2012" ), "person 2", "another 1", 43, 664 )
INSERT INTO V VALUES ( CTOD( "6/4/2012" ), "person 2", "more desc", 78, 332 )
INSERT INTO V VALUES ( CTOD( "6/6/2012" ), "person 2", "anything", 366, 854 )

sd = CTOD( "6/3/2012" )      && start date of transactions
ed = DATE()  && current date as the end date...

Now, the querying... You are trying to get groups by type but the per person (name) needs to be pre-aggregated on a per person basis FIRST. Now, it appears you are trying to get a total opening balance of transactions prior to the start date (sd) as the basis at a given point in time, then looking at activity WITHIN the start/end date in question. Do this first, but don't deal with adding in the "obal" column from the "vn" table. Since it needs aggregates of non-group by columns, I would just use "MAX()" of the column. Since it is by a PK (name) basis, you'll end up with whatever it was, but with the rolled-up totals of transactions, yet have all your data pre-summarized into a single row via...

select;
      vn.name,;
      vn.type,;
      MAX( vn.obal ) as BalByNameOnly,;
      SUM( IIF( v.date < sd, v.credit-v.debit, 000000.00 )) OpBal, ;
      SUM( IIF( BETWEEN(v.date, sd, ed), v.credit - v.debit, 000000.00 )) CurBal ;
   FROM ;
      v,;
      vn ;
   WHERE ;
      v.name = vn.name;
   GROUP BY ;
      vn.Name,;
      vn.Type;
   INTO ;
      CURSOR C_JustByName READWRITE 

With this results (from my sample data) would look like...

Name      Type            BalByNameOnly   OpBal   CurBal
person 1  person type 1       125         -90     -63 
person 2  another type       2155         621     742

Your final aggregate to get by type, you can just query the above result "cursor" (C_JustByName) and use IT to get your grouping by type, having, etc... something like

SELECT ;
      JBN.type, ;
      JBN.BalByNameOnly - JBN.OpBal as OpBal,;
      JBN.CurBal ;
    FROM ;
       C_JustByName JBN ;
    GROUP BY ;
       vn.type ;
    ORDER BY ;
       vn.type ;
    HAVING ;
       OpBal + CurBal != 0;
    INTO ;
       CURSOR C_Final

Now, I am just simplifying the above because I don't know what you are really looking for as the date within your "VN" (appears to be like a customer table) with a date that is unclear of its purpose and its oBal column with respect to the transactions table.

The nice thing about VFP is that you can query into a temporary cursor without creating a permanent table and use IT as the basis for any querying after that... It helps in the readability of not having to nest query inside query inside query. It also allows you to see the results of each layer and know you are getting the answers you are EXPECTING before continuing to the next query phase...

Hopefully this will help you in the direction of what you are trying to solve.

Retrogression answered 14/7, 2012 at 2:10 Comment(1)
It is unfortunate that this question from 6 years back has been bumped to front page :( The reason is that the question is starving for some explanations and the design is screaming of flaws. Without knowing those details no answer would be good. 1) Never ever rely on the old enginebehavior to workaround this error. It was a BUG and corrected. Relying on a bug is not the way to go. 2) To create a date value, never ever do that by converting from a string. That is settings dependent. Better either use solid date(), datetime() functions or strict date\datetime literals.Recollected

© 2022 - 2024 — McMap. All rights reserved.