get JOIN table as array of results with PostgreSQL/NodeJS
Asked Answered
H

3

12

I'm creating an app where users are able to create questions, and others can upvote/downvote them.

The following is a part of my sql schema:

CREATE TABLE "questions" (
  id            SERIAL,
  content       VARCHAR(511) NOT NULL,
  created_at    TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
  CONSTRAINT    pk_question PRIMARY KEY (id)
);

CREATE TABLE "votes" (
  id            SERIAL,
  value         INT,
  question_id   INT NOT NULL,
  CONSTRAINT    pk_vote PRIMARY KEY (id),
  CONSTRAINT    fk_question_votes FOREIGN KEY (question_id) REFERENCES questions (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE
);

What I would like to have is Postgres giving me each question with an array of votes, like that:

[{ // a question
  id: 1,
  content: 'huh?',
  votes: [{ // a vote
    id: 1,
    value: 1
  }, { // another vote
    id: 2,
    value: -1
  }]
}, { /*another question with votes*/ }]

I looked at aggregate functions (like array_agg()) but it gave me only the values. A JOIN gave me a question joined with a vote, and would force me to do server side operations, which I would prefer not to.

Is there any way to do that? Is my reasoning regarding what I want to obtain wrong?

Thanks for your time.

Hoang answered 1/10, 2016 at 10:53 Comment(2)
I think you are looking for json_agg() and row_to_json()Dube
@a_horse_with_no_name since PostgreSQL 9.4, the pair of json_agg() + json_build_object() makes for a better solution. See my answer below.Lactoscope
L
11

This is easy to do with pg-promise:

function buildTree(t) {
    const v = q => t.any('SELECT id, value FROM votes WHERE question_id = $1', q.id)
        .then(votes => {
            q.votes = votes;
            return q;
        });

    return t.map('SELECT * FROM questions', undefined, v).then(a => t.batch(a));
}

db.task(buildTree)
    .then(data => {
        console.log(data); // your data tree
    })
    .catch(error => {
        console.log(error);
    });

The same as above, but using ES7 async/await syntax:

await db.task(async t => {
    const questions = await t.any('SELECT * FROM questions');
    for(const q of questions) {
        q.votes = await t.any('SELECT id, value FROM votes WHERE question_id = $1', [q.id]);
    }
    return questions;
});
// method "task" resolves with the correct data tree

API: map, any, task, batch


Related questions:


And if you want to use just a single query, then using PostgreSQL 9.4 and later syntax you can do the following:

SELECT json_build_object('id', q.id, 'content', q.content, 'votes',
    (SELECT json_agg(json_build_object('id', v.id, 'value', v.value))
     FROM votes v WHERE q.id = v.question_id))
FROM questions q

And then your pg-promise example would be:

const query =
    `SELECT json_build_object('id', q.id, 'content', q.content, 'votes',
        (SELECT json_agg(json_build_object('id', v.id, 'value', v.value))
         FROM votes v WHERE q.id = v.question_id)) json
    FROM questions q`;
    
const data = await db.map(query, [], a => a.json);

And you definitely will want to keep such complex queries in external SQL files. See Query Files.

Conclusion

The choice between the two approaches presented above should be based on the performance requirements of your application:

  • The single-query approach is faster, but is somewhat difficult to read or extend, being fairly verbose
  • The multi-query approach is easier to understand and to extend, but it is not great for performance, due to dynamic number of queries executed.

UPDATE-1

The following related answer offers more options, by concatenating child queries, which will give a much improved performance: Combine nested loop queries to parent result pg-promise.

UPDATE-2

Another example added, using ES7 async/await approach.

Lactoscope answered 1/10, 2016 at 14:34 Comment(5)
Thank you for your answer, I'm not a real fan of the multi-query approach since it doesn't scale well, but the other one with json_build_object is exactly what I was looking for!Hoang
Isn't both solutions are awful from a performance perspective? n + 1 queries (with the json_agg you "just" save the io, but the db still needs to run this query against every row, IIRC)Insult
@Insult The second approach, via one query is very fast, according to my tests.Lactoscope
Added ES7 example.Lactoscope
I like the one-query solution. I am using the Client Instance and call it like this: client.query(SELECT json_build_object("handle", handle), [param1]). Weirdly the response object key is json_build_object !? Response: {"json_build_object":{"handle", "1234"}}Springtail
H
2

Please think simple way, May be I am right, I use knex js

 let allpost = knex
        .select([
            'questions.id',
            'question.content',
            knex.raw('json_agg(v.*) as votes')
        ])
        .from('questions')
        .leftJoin('votes as v', 'questions.id', 'v.question_id')
        .groupBy('questions.id');
Hallucinogen answered 11/3, 2019 at 8:1 Comment(0)
K
0

sql-toolkit does exactly this. It's a node library built for pg-promise which allows you to write regular native SQL and receive back properly structured (nested) pure business objects, without either having to split up the query or rewrite it with json_build_object.

For example:

class Article extends BaseDAO {
  getBySlug(slug) {
    const query = `
      SELECT
        ${Article.getSQLSelectClause()},
        ${Person.getSQLSelectClause()},
        ${ArticleTag.getSQLSelectClause()},
        ${Tag.getSQLSelectClause()}
      FROM article
      JOIN person
        ON article.author_id = person.id
      LEFT JOIN article_tags
        ON article.id = article_tags.article_id
      LEFT JOIN tag
        ON article_tags.tag_id = tag.id
      WHERE article.slug = $(slug);
  `;
  return this.one(query, { slug });
  // OUTPUT: Article {person: Person, tags: Tags[Tag, Tag, Tag]}
}

The select clause uses the business object "getSQLSelectClause" methods to save tedium in typing the columns, as well as ensure no collisions of names (nothing magical going on, and could just be written out instead).

The this.one is a call into sql-toolkits base DAO class. It is responsible for structuring the flat result records into a nice nested structure.

(Also notice that it is "one" which matches our mental model for the SQL. The DAO methods for one, oneOrNone, many, and any ensure their count against the number of generated top level business objects - not the number of rows the sql expression returns!)

Check out the repository for details on how to set it up on top of pg-promise. It's strictly an enhancement, and does not seek to abstract out pg-promise (you still set up pg-promise and can use it directly). (Disclamer, I am the author of sql-toolkit.)

Kasandrakasevich answered 12/2, 2019 at 2:9 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.