I'm new to SQL and I have a question.
I have this SQL code:
DROP TABLE if exists students;
DROP TABLE if exists grades;
CREATE TABLE students(
s_id integer NOT NULL PRIMARY KEY,
s_name text,
s_last_name text,
curr_year integer
);
CREATE TABLE grades(
s_id integer NOT NULL PRIMARY KEY,
course text,
c_year integer,
grade integer,
FOREIGN KEY (s_id) REFERENCES students
);
INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (1, 'A', 'S', 3);
INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (2, 'A', 'A', 2);
INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (3, 'V', 'B', 1);
INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (4, 'K', 'N', 2);
INSERT INTO grades (s_id, course, c_year, grade)
VALUES (1, 'DB', 2, 98);
INSERT INTO grades (s_id, course, c_year, grade)
VALUES (2, 'OS', 3, 90);
INSERT INTO grades (s_id, course, c_year, grade)
VALUES (3, 'DB', 2, 94);
EXPLAIN ANALYZE
SELECT *
FROM students s JOIN grades gr
ON s.s_id = gr.s_id
WHERE curr_year > 0;
CREATE INDEX student_details ON students(s_id, s_name, s_last_name);
CREATE INDEX student_courses ON grades(s_id, course);
EXPLAIN ANALYZE
SELECT *
FROM students s JOIN grades gr
ON s.s_id = gr.s_id
WHERE curr_year > 0;
DROP INDEX student_details;
DROP INDEX student_courses;
DROP TABLE students CASCADE;
DROP TABLE grades CASCADE;
And I try to understand the explain outputs. Before the INDEX creating I got hash merge. Here is the explain output:
Hash Join (cost=23.50..51.74 rows=270 width=116) (actual time=0.039..0.050 rows=3 loops=1)
Hash Cond: (gr.s_id = s.s_id)
-> Seq Scan on grades gr (cost=0.00..21.30 rows=1130 width=44) (actual time=0.005..0.008 rows=3 loops=1)
-> Hash (cost=20.12..20.12 rows=270 width=72) (actual time=0.021..0.021 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on students s (cost=0.00..20.12 rows=270 width=72) (actual time=0.006..0.011 rows=4 loops=1)
Filter: (curr_year > 0)
Planning time: 0.147 ms
Execution time: 0.089 ms
(9 rows)
And after the INDEX creating I get Nested Loop:
Nested Loop (cost=0.00..2.12 rows=1 width=116) (actual time=0.031..0.116 rows=3 loops=1)
Join Filter: (s.s_id = gr.s_id)
Rows Removed by Join Filter: 9
-> Seq Scan on students s (cost=0.00..1.05 rows=1 width=72) (actual time=0.012..0.018 rows=4 loops=1)
Filter: (curr_year > 0)
-> Seq Scan on grades gr (cost=0.00..1.03 rows=3 width=44) (actual time=0.003..0.009 rows=3 loops=4)
Planning time: 0.396 ms
Execution time: 0.170 ms
But I can't figure out why? Why the indexing in my case made Nested Loop be preferred over Hash Join? I'll be very happy to get an explanation for that.
Thanks a lot!
sid
in both tables, so including this column at the start of the index will render the index useless. – Stotts