EDIT 2022/12/13: Summary: for best performance, the SQL need be no-join, no-union. That is how come the following solution.
I have first considered using join like in other ranked answers, but find that can not find true smallest unused id, e.g.,
3,5,6 should get 1 as smallest unused id, but their results are 4.
Another thing is that when the column is from a subquery, I don't want to copy the subquery again to join itself,
so I come up with another way to get true smallest unused id.
Assuming the id > 0 and is unique.
select unused_id as minimum_unused_id
from (
select
case
when id <> ifnull(lag(id) over (order by id), 0) + 1 -- when id <> prev_id_add_1
then ifnull(lag(id) over (order by id), 0) + 1 -- then prev_id_add_1
when id <> ifnull(lead(id) over (order by id), 0) - 1 -- when id <> next_id_dec_1
then id + 1 -- then id + 1
end
as unused_id
from (
select 1 as id from dual
union select 2 as id from dual
union select 4 as id from dual
union select 5 as id from dual
) unique_ids
order by id
) t
where unused_id is not null
limit 1
The result (smallest_unused_id) is
3
Please replace the unique_ids
subquery.
Other combination tested:
- 1,2,3,4,5 -> 6
- 3,4,5,7 -> 1
- 1,2,4,5 -> 3
Note that if no any record in unique_ids
subquery, then it means result is 1.
Explanation:
lag(id) over (order by id)
will get the column value of previous record. See https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_lag.
lead(id) over (order by id)
will get the column value of next record. See https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_lead.
select
lag(id) over (order by id) as prev_id,
id,
lead(id) over (order by id) as next_id
from (
select 2 as id from dual
union select 4 as id from dual
union select 5 as id from dual
union select 7 as id from dual
) YourSubQuery
order by id
will output
prev_id |
id |
next_id |
NULL |
2 |
4 |
2 |
4 |
5 |
4 |
5 |
7 |
5 |
7 |
NULL |
You can see all we want is the first id which id != prev_id+1
or next_id != id + 1
(treat null prev_id as 0, null next_id as 0).
You can copy all above SQLs to SQL fiddle https://www.db-fiddle.com/ to have a try.
Another benefit of this solution is that it can fully utilize the index, e.g.,
DROP TABLE IF EXISTS `t`;
CREATE TABLE `t` (
`id` BIGINT NOT NULL AUTO_INCREMENT,
`type` SMALLINT,
`sequence` SMALLINT,
`is_valid` TINYINT DEFAULT 1,
`del` CHAR(0) GENERATED ALWAYS AS (if(`is_valid` = 1,'',NULL)) STORED,
`other_columns` VARCHAR(100) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE INDEX `uniq_idx_t_type_seq` (`type`,`del`,`sequence`) USING BTREE
);
then add test data:
DELIMITER $$
DROP PROCEDURE IF EXISTS add_test_data;
CREATE PROCEDURE add_test_data()
begin
set @seq = 1;
while (@seq <= 32767) do
insert ignore into t (type, sequence) values(1, @seq);
set @seq = @seq + 1;
end while;
end; $$
DELIMITER ;
call add_test_data;
DROP PROCEDURE IF EXISTS add_test_data;
Then lets see the actual sql
select unused_seq
from (
select
case
when sequence <> ifnull(lag(sequence) over (order by sequence), 0) + 1
then ifnull(lag(sequence) over (order by sequence), 0) + 1
when sequence <> ifnull(lead(sequence) over (order by sequence), 0) - 1
then sequence + 1
end
as unused_seq
from
t
where (type = 1 and del = '')
order by sequence
) as t
where unused_seq is not null
limit 1
The result is 32767, the time cost is 90ms (in a docker container on Macbook Pro).
The execute plan shows that it use the index uniq_idx_t_type_seq
,
the order by sequence
does not cost anything because the index naturally is ordered in that order.
The above Query can be easily migrated to other type of database, because the Window function like lag/lead is common in nowdays.
EDIT: There are other simpler queries, such as as a colleague suggested, using the MySQL's Sequence Stroage,
SELECT * FROM seq_1_to_32767
EXCEPT
SELECT sequence from t where (type = 1 and del = '') order by sequence.
It is more elegant, the performance should be also good.