PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
Asked Answered
S

3

12

I have a long chain of associations, joins, order by, etc., that is ultimately selecting from one of my rails models. At the end of the day I need the results to be unique and sorted. I don't care what columns are used in the SELECT statement, what appears in the ORDER BY, etc. (these all change based on the filtering options the user has selected), I just care that the top level model/table in the query is unique (based on id).

For background, widgets is the main table, and we are joining with widget_steps, and this is in Rails 3 (company is trying to upgrade ASAP but that's what they're stuck with at the moment)

Here is the query and error that is being generated:

PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...completed_at" IS NULL)) ORDER BY sequential DESC, widget_s...
                                                             ^
: SELECT  DISTINCT "widgets".* FROM "widgets" INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id" INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id" WHERE "widgets"."account_id" = 1 AND "widgets"."completed_at" IS NULL AND (("widgets"."sequential" = 't' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."id" = "widgets"."active_widget_step_id" AND "widget_steps"."completed_at" IS NULL) OR ("widgets"."sequential" = 'f' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."completed_at" IS NULL)) ORDER BY sequential DESC, widget_steps.name ASC LIMIT 10 OFFSET 0
Completed 500 Internal Server Error in 52.3ms

ActiveRecord::StatementInvalid - PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...completed_at" IS NULL)) ORDER BY sequential DESC, widget_s...
                                                             ^
: SELECT  DISTINCT "widgets".* FROM "widgets" INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id" INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id" WHERE "widgets"."account_id" = 1 AND "widgets"."completed_at" IS NULL AND (("widgets"."sequential" = 't' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."id" = "widgets"."active_widget_step_id" AND "widget_steps"."completed_at" IS NULL) OR ("widgets"."sequential" = 'f' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."completed_at" IS NULL)) ORDER BY sequential DESC, widget_steps.name ASC LIMIT 10 OFFSET 0:

Why is this a thing? What does postgres think is so ambiguous about this? Why do queries like this always work fine in MySQL but make postgres choke.

I have tried:

  1. specifying .select([everything mentioned in the order by]).uniq at the end of the chain
  2. specifying .uniq at the end of the chain without doing a custom select
  3. writing some custom AREL to try to embed all of this in a sub query and then do the .uniq or the .order outside of this (can't get this working)
  4. doing the .uniq outside of postgres (this breaks because of pagination ... you can end up with some pages that only have 1 or 2 items on them because of duplicates being removed)
  5. crying
Shih answered 16/3, 2017 at 22:58 Comment(3)
What is sequential here? Is it column of some table or something else?Bonniebonns
it is a column on widgetShih
Think about it: when there are (at least) two rows of widget_steps, which satisfies your condition for a single row of widget, which one's name should be used in the ORDER BY? PostgreSQL won't decide this instead of you, that's why it says that put widget_steps.name into the SELECT clause (to make that one also DISTINCT). OFC, MySQL doesn't bother with such kind of nonsenses: it allows you to run whatever wrong query you come up with.Hairraising
B
13

You need to add widget_steps.name to list of selected columns:

SELECT DISTINCT "widgets".*,
                "widget_steps.name"
FROM "widgets"
INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id"
INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id"
WHERE "widgets"."account_id" = 1
  AND "widgets"."completed_at" IS NULL
  AND (("widgets"."sequential" = 't'
        AND "widget_steps"."assigned_to" = 5
        AND "widget_steps"."id" = "widgets"."active_widget_step_id"
        AND "widget_steps"."completed_at" IS NULL)
       OR ("widgets"."sequential" = 'f'
           AND "widget_steps"."assigned_to" = 5
           AND "widget_steps"."completed_at" IS NULL))
ORDER BY sequential DESC,
         widget_steps.name ASC
LIMIT 10
OFFSET 0

This should not change logic of your query and will work fine.

In Rails you may use select method to set list of selected columns:

Widget.select('"widgets".*, "widget_steps.name"')

Hope this would help.

Bonniebonns answered 17/3, 2017 at 1:7 Comment(1)
this worked! thanks! though note that widget_steps.name isn't a thing -- had to use widgets.nameShih
C
3

Another option with this error that may work for some cases, such as a has_and_belongs_to_many type relation, is to unscope order:

class Resource < ApplicationRecord
  has_and_belongs_to_many :things
  scope :with_blue_things, -> { joins(:things).where(:things => {:color => :blue}).unscope(:order).distinct }

This should let you do all these

Resource.with_blue_things
Resource.with_blue_things.count
Resource.with_blue_things.order(:name).count
Carthy answered 19/11, 2020 at 20:9 Comment(0)
R
0

Suggestion:

You can add has_many :steps in Widget class, And use Widget.includes(:steps) to exec a query with Order clause which order by widget_steps columns

Rebeca answered 17/11, 2020 at 3:27 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.