How to pivot in SQLite or i.e. select in wide format a table stored in long format?
Asked Answered
H

6

53

I'd like to get a table which stores students data in long format and the marks they receive for all of their subjects in one query.

This is my table structure:

Table: markdetails

## studid ## ## subjectid ##  ## marks ##
     A1            3                50
     A1            4                60
     A1            5                70
     B1            3                60
     B1            4                80
     C1            5                95

Table: student info

Actual Structure:

## studid ##  ## name ##
      A1          Raam
      B1          Vivek
      c1          Alex

I want the result set to have the following wide format structure as result of the pivotization:

Table: Student Info

## studid ## ## name## ## subjectid_3 ## ## subjectid_4 ## ## subjectid_5 ##
      A1        Raam        50                60                 70
      B1        Vivek       60                80                null
      c1        Alex       null              null                95

How can I accomplish this in SQLite?

Henhouse answered 6/8, 2009 at 5:5 Comment(1)
Check out this extension.Simonette
G
30

First you need to change the current table to a temp table:

alter table student_info rename to student_name

Then, you'll want to recreate student_info:

create table student_info add column (
    stuid VARCHAR(5) PRIMARY KEY,
    name VARCHAR(255),
    subjectid_3 INTEGER,
    subjectid_4 INTEGER,
    subjectid_5 INTEGER
)

Then, populate student_info:

insert into student_info
select
    u.stuid,
    u.name,
    s3.marks as subjectid_3,
    s4.marks as subjectid_4,
    s5.marks as subjectid_5
from
    student_temp u
    left outer join markdetails s3 on
        u.stuid = s3.stuid
        and s3.subjectid = 3
    left outer join markdetails s4 on
        u.stuid = s4.stuid
        and s4.subjectid = 4
    left outer join markdetails s5 on
        u.stuid = s5.stuid
        and s5.subjectid = 5

Now, just drop your temp table:

drop table student_temp

And that's how you can quickly update your table.

SQLite lacks a pivot function, so the best you can do is hard-code some left joins. A left join will bring match any rows in its join conditions and return null for any rows from the first, or left, table that don't meet the join conditions for the second table.

Gilgilba answered 6/8, 2009 at 5:14 Comment(8)
thanks Eric...the querry works fine for getting the entire details about a student.but i want to modify the content & columns in a table.I think u did not get my question.i want to change the table.Henhouse
@arams: Fantastic, glad to hear it! Please upvote/mark this as the answer if it solved your problem!Gilgilba
SQlite does support VIEWs though, so prevent data segmentation and use that?Liuka
It might be a typo on the first line. Perhaps it should read alter table student_info rename to student_temp instead?Tarpon
@Gilgilba My application of this, requires even more columns (369 to be exact; as opposed to 3 subjects in OP's application). But an sqlite statement cannot execute more than 64 joins. Can you suggest edits here to accomplish more than 64 joins? Or is there a workaround?Smoke
Found the workaround. Referred to haridsv's https://mcmap.net/q/338272/-how-to-pivot-in-sqlite-or-i-e-select-in-wide-format-a-table-stored-in-long-format answer!Smoke
"Now, just drop your temp table: drop table student_temp", where does this student_temp temp table coming from, Eric?Assiduous
Note that storing data in long format is very extensible as the OP already does while changing the underlying schema structure to wide format isn't i.e. it's too specialized and breaks any dependent code more often. I don't see this answer as the best alternative to provide a solution for the OP.Dragrope
S
48

Here is the SQL to create the schema for this example. For anyone who wants to try the solution from @Eric.

create table markdetails (studid, subjectid, marks);
create table student_info (studid, name);

insert into markdetails values('A1', 3, 50);
insert into markdetails values('A1', 4, 60);
insert into markdetails values('A1', 5, 70);
insert into markdetails values('B1', 3, 60);
insert into markdetails values('B1', 4, 80);
insert into markdetails values('C1', 5, 95);

insert into student_info values('A1', 'Raam');
insert into student_info values('B1', 'Vivek');
insert into student_info values('C1', 'Alex');

Here is an alternative solution using case with group by.

select
    si.studid,
    si.name,
    sum(case when md.subjectid = 3 then md.marks end) subjectid_3,
    sum(case when md.subjectid = 4 then md.marks end) subjectid_4,
    sum(case when md.subjectid = 5 then md.marks end) subjectid_5
from student_info si
join markdetails md on
        md.studid = si.studid
group by si.studid, si.name
;

For comparison, here is the same select statement from @Eric's solution:

select
    u.stuid,
    u.name,
    s3.marks as subjectid_3,
    s4.marks as subjectid_4,
    s5.marks as subjectid_5
from
    student_info u
    left outer join markdetails s3 on
        u.stuid = s3.stuid
        and s3.subjectid = 3
    left outer join markdetails s4 on
        u.stuid = s4.stuid
        and s4.subjectid = 4
    left outer join markdetails s5 on
        u.stuid = s5.stuid
        and s5.subjectid = 5
;
Shoat answered 24/11, 2011 at 2:52 Comment(6)
I had the opportunity to sort of test this, on a table with about 150,000 rows. One complication is that I don't know the number of columns in advance, so I have to do a little pre-processing to determine the number of needed columns. Also, not all rows have the same number of data. With the outer join method, it took my PC 50 seconds. With the case when method, it took 15 seconds. Using a combination of reshape2 and plyr (I'm using R to run sqlite), it took about 1,040 seconds. You mileage may vary, however.Frown
@Chow, totally agree. my table is with 280,000 rows and it took like 20 seconds. This answer should be on top.Smoke
@Shoat nice and better answer. Also, good read: modern-sql.com/use-case/pivotCollencollenchyma
For anyone who tries this with strings, use MAX instead of SUM in the pivot CASE lines. Otherwise your strings will get interpreted as numbers with possibly bizarre results.Amalamalbena
Did you add the appropriate indexes? e.g. (subjectid), (studid) and on (studid, name)Dragrope
I first tried Eric's solution (with multiple inner joins) but SQLite reached the limit of 200 'from' clauses and crashed. So I tried the 'case when' solution, and it worked well with 260 'case when' instanced querying a 20M rows database and took 2 mins to run.Greaten
G
30

First you need to change the current table to a temp table:

alter table student_info rename to student_name

Then, you'll want to recreate student_info:

create table student_info add column (
    stuid VARCHAR(5) PRIMARY KEY,
    name VARCHAR(255),
    subjectid_3 INTEGER,
    subjectid_4 INTEGER,
    subjectid_5 INTEGER
)

Then, populate student_info:

insert into student_info
select
    u.stuid,
    u.name,
    s3.marks as subjectid_3,
    s4.marks as subjectid_4,
    s5.marks as subjectid_5
from
    student_temp u
    left outer join markdetails s3 on
        u.stuid = s3.stuid
        and s3.subjectid = 3
    left outer join markdetails s4 on
        u.stuid = s4.stuid
        and s4.subjectid = 4
    left outer join markdetails s5 on
        u.stuid = s5.stuid
        and s5.subjectid = 5

Now, just drop your temp table:

drop table student_temp

And that's how you can quickly update your table.

SQLite lacks a pivot function, so the best you can do is hard-code some left joins. A left join will bring match any rows in its join conditions and return null for any rows from the first, or left, table that don't meet the join conditions for the second table.

Gilgilba answered 6/8, 2009 at 5:14 Comment(8)
thanks Eric...the querry works fine for getting the entire details about a student.but i want to modify the content & columns in a table.I think u did not get my question.i want to change the table.Henhouse
@arams: Fantastic, glad to hear it! Please upvote/mark this as the answer if it solved your problem!Gilgilba
SQlite does support VIEWs though, so prevent data segmentation and use that?Liuka
It might be a typo on the first line. Perhaps it should read alter table student_info rename to student_temp instead?Tarpon
@Gilgilba My application of this, requires even more columns (369 to be exact; as opposed to 3 subjects in OP's application). But an sqlite statement cannot execute more than 64 joins. Can you suggest edits here to accomplish more than 64 joins? Or is there a workaround?Smoke
Found the workaround. Referred to haridsv's https://mcmap.net/q/338272/-how-to-pivot-in-sqlite-or-i-e-select-in-wide-format-a-table-stored-in-long-format answer!Smoke
"Now, just drop your temp table: drop table student_temp", where does this student_temp temp table coming from, Eric?Assiduous
Note that storing data in long format is very extensible as the OP already does while changing the underlying schema structure to wide format isn't i.e. it's too specialized and breaks any dependent code more often. I don't see this answer as the best alternative to provide a solution for the OP.Dragrope
I
12

great appendix! helped me to solve a similar problem with low effort and system load. I am using a Raspberry Pi to obtain 1wire-interface DS18B20 temperature sensor data as follows:

CREATE TABLE temps (Timestamp DATETIME, sensorID TEXT, temperature NUMERIC);

example:

sqlite> .headers on
sqlite> .mode column
sqlite> select * from temps where timestamp > '2014-02-24 22:00:00';

Timestamp            sensorID         temperature
-------------------  ---------------  -----------
2014-02-24 22:00:02  28-0000055f3f10  19.937
2014-02-24 22:00:03  28-0000055f0378  19.687
2014-02-24 22:00:04  28-0000055eb504  19.937
2014-02-24 22:00:05  28-0000055f92f2  19.937
2014-02-24 22:00:06  28-0000055eef29  19.812
2014-02-24 22:00:07  28-0000055f7619  19.625
2014-02-24 22:00:08  28-0000055edf01  19.687
2014-02-24 22:00:09  28-0000055effda  19.812
2014-02-24 22:00:09  28-0000055e5ef2  19.875
2014-02-24 22:00:10  28-0000055f1b83  19.812
2014-02-24 22:10:03  28-0000055f3f10  19.937
2014-02-24 22:10:04  28-0000055f0378  19.75
2014-02-24 22:10:04  28-0000055eb504  19.937
2014-02-24 22:10:05  28-0000055f92f2  19.937

using the SUBSTR() command I am "normalizing" the Timestamps to 10 minutes periods. With JOIN the sensorID is changed into a SensorName using the lookup-table 'sensors'

CREATE VIEW [TempsSlot10min] AS
SELECT SUBSTR(datetime(timestamp),1,15)||'0:00' AS TimeSlot,
SensorName,
temperature FROM
temps JOIN sensors USING (sensorID, sensorID);

example:

sqlite> select * from TempsSlot10min where timeslot >= '2014-02-24 22:00:00';

TimeSlot             SensorName  temperature
-------------------  ----------  -----------
2014-02-24 22:00:00  T1          19.937
2014-02-24 22:00:00  T2          19.687
2014-02-24 22:00:00  T3          19.937
2014-02-24 22:00:00  T4          19.937
2014-02-24 22:00:00  T5          19.812
2014-02-24 22:00:00  T6          19.625
2014-02-24 22:00:00  T10         19.687
2014-02-24 22:00:00  T9          19.812
2014-02-24 22:00:00  T8          19.875
2014-02-24 22:00:00  T7          19.812
2014-02-24 22:10:00  T1          19.937
2014-02-24 22:10:00  T2          19.75
2014-02-24 22:10:00  T3          19.937
2014-02-24 22:10:00  T4          19.937
2014-02-24 22:10:00  T5          19.875

now, the magic happens with the above mentioned CASE instruction.

CREATE VIEW [PivotTemps10min] AS
SELECT TimeSlot,
AVG(CASE WHEN sensorName = 'T1' THEN temperature END) AS T1,
AVG(CASE WHEN sensorName = 'T2' THEN temperature END) AS T2,
...
AVG(CASE WHEN sensorName = 'T10' THEN temperature END) AS T10
FROM TempsSlot10min
GROUP BY TimeSlot;

example:

select * from PivotTemps10min where timeslot >= '2014-02-24 22:00:00';

TimeSlot             T1          T2              T10
-------------------  ----------  ---------- ...  ----------
2014-02-24 22:00:00  19.937      19.687          19.687
2014-02-24 22:10:00  19.937      19.75           19.687
2014-02-24 22:20:00  19.937      19.75           19.687
2014-02-24 22:30:00  20.125      19.937          19.937
2014-02-24 22:40:00  20.187      20.0            19.937
2014-02-24 22:50:00  20.25       20.062          20.062
2014-02-24 23:00:00  20.25       20.062          20.062

The only problem remaining here is that the sensorName 'T1' ... 'T10' is now hardcoded into the VIEW [PivotTemps10min] and not taken from the lookup table.

Nonetheless, thank you very much for the answers in this thead!

Invar answered 24/2, 2014 at 21:52 Comment(2)
I am sure a large number of IoT enthusiasts using SQL will refer to this. My application is almost exactly the same.Smoke
1 query caused be 13 hours of torment, and 1 answer brings me back to sanity. Thank you good sirMonadism
H
5

Thanks to @pospec4444's link here is modified version of @haridsv's awesome answer. It uses filter clause to be little more concise

select
    si.studid,
    si.name,
    sum(md.marks) filter(where md.subjectid = 3) subjectid_3,
    sum(md.marks) filter(where md.subjectid = 4) subjectid_4,
    sum(md.marks) filter(where md.subjectid = 5) subjectid_5
from student_info si
join markdetails md on
        md.studid = si.studid
group by si.studid, si.name
;
Housebreaker answered 31/12, 2019 at 5:19 Comment(0)
P
0

If you have a simpler requirement of bundling together the children in the same field, group_concat is your friend.

Huge thanks to Simon Slaver from this thread: http://sqlite.1065341.n5.nabble.com/Howto-pivot-in-SQLite-tp26766p26771.html

Politicize answered 26/8, 2016 at 16:18 Comment(2)
From the Help Center: Links to external resources are encouraged, but please add context around the link so your fellow users will have some idea what it is and why it’s there. Always quote the most relevant part of an important link, in case the target site is unreachable or goes permanently offline.Peripteral
Link doesnt workMonadism
C
0

The cleanest solution, in this case, would be to use filter.

Generally speaking, there are other ways of pivoting data in SQLite — namely, using dynamic SQL or the pivot extension. They are useful when you don't know the specific columns in advance. See Building a Pivot Table in SQLite for details. But they would be overkill in this particular case.

Calvities answered 9/2, 2023 at 21:4 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.