Get minimum unused value in MySQL column

I have a table with integer ID column. I would like to get the minimum unused value for this column. The query should find the first hole in table IDs and get the minimum value inside it. I'll try to explain it with some examples.

Example 1: no-holes table

In this case, the table has no holes, so the query should simply return the value after the current maximum: should get 4.

|id|
|1 |
|2 |
|3 |

Example 2: table with hole on top

In this case, we have a hole on top (missing value: 1). The query finds the hole and gets the minimum value inside it: should get 1.

|id|
|2 |
|3 |
|4 |

Also in this case, we have a hole on top, but we have more missing values inside it (missing values: 1 and 2). The query finds the hole and gets the minimum value inside it: should get 1.

|id|
|3 |
|4 |
|5 |

Example 3: table with hole in the middle

In this case, we have a hole in the middle (missing values: 2 and 3). The query finds the hole and gets the minimum value inside it: should get 2.

|id|
|1 |
|4 |
|5 |

Example 4: table with holes on top and in the middle

In this case, we have multiple holes: one on top (missing value: 1) and two in the middle (missing values: 3 and 5). The query finds the first hole and gets the minimum value inside it: should get 1.

|id|
|2 |
|4 |
|6 |

I've tried the solution proposed in this post, but it doesn't work as expected in my case. Any ideas?

Zeller answered 8/9, 2014 at 7:58 Comment(0)
R
14
SELECT min(unused) AS unused
FROM (
    SELECT MIN(t1.id)+1 as unused
    FROM yourTable AS t1
    WHERE NOT EXISTS (SELECT * FROM yourTable AS t2 WHERE t2.id = t1.id+1)
    UNION
    -- Special case for missing the first row
    SELECT 1
    FROM DUAL
    WHERE NOT EXISTS (SELECT * FROM yourTable WHERE id = 1)
) AS subquery
Rectory answered 8/9, 2014 at 8:12 Comment(4)
@Rectory And how do I include other WHERE conditions in this query? For example, if I want to select a number from a set of rows with SubId 10? - Cordiacordial
How does that other condition fit in with finding holes in the ID sequence? If ID = 10 fits the condition and ID = 11 doesn't fit the condition, should it return 11? - Rectory
@Rectory Good thing I checked this answer again after linking it to my new question, or I would not have seen your reply without the @... No, I mean I want the query to work only with rows that have a specific customId. For example, the table could contain rows from 1 to 10, with 2 missing, for every customId (let's say customId is 1 and 2). In total this would be 20 rows, but I need to select only from the rows with customId 1. - Cordiacordial
So the query would work with only ten rows in this example and give me the missing value among them, which is what I want, as opposed to scanning the whole table. - Cordiacordial
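
One possible way to restrict the answer's query to a single group, sketched under assumptions: the grouping column is called customId (the name used in the comments), the numbered column is still called id, and the literal 1 stands in for whichever customId is wanted. The same filter has to appear in all three places, otherwise rows from other groups would hide holes.

```sql
-- Sketch only: customId and the literal 1 are assumptions taken from the comments.
SELECT MIN(unused) AS unused
FROM (
    -- Smallest id in the group whose successor is missing within the same group
    SELECT MIN(t1.id) + 1 AS unused
    FROM yourTable AS t1
    WHERE t1.customId = 1
      AND NOT EXISTS (
          SELECT *
          FROM yourTable AS t2
          WHERE t2.customId = t1.customId
            AND t2.id = t1.id + 1
      )
    UNION
    -- Special case: the group has no row with id = 1
    SELECT 1
    FROM DUAL
    WHERE NOT EXISTS (
        SELECT * FROM yourTable WHERE customId = 1 AND id = 1
    )
) AS subquery;
```

A composite index on (customId, id) would probably help here, though that is a guess that should be checked with EXPLAIN on real data.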
G
6

A slightly different way to do it, using a join rather than EXISTS:

SELECT MIN(t1.id)
FROM 
(
    SELECT 1 AS id
    UNION ALL
    SELECT id + 1
    FROM yourTable
) t1
LEFT OUTER JOIN yourTable t2
ON t1.id = t2.id
WHERE t2.id IS NULL;

The downside of any solution that uses a subquery is that it is unlikely to use any indexes.

Gallican answered 9/9, 2014 at 9:29 Comment(1)
Wow, simple and effective! Thanks very much and +1 for the help!! - Zeller
W
1

You can create a table with just numbers in it. I'm simulating this table with a subquery in the query below. Then you can left join your table to it.

SELECT
MIN(numbers.n) AS missing_value
FROM (SELECT 1 as n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4) numbers
LEFT JOIN your_table yt ON numbers.n = yt.id
WHERE yt.id IS NULL
Whitish answered 8/9, 2014 at 8:5 Comment(2)
This solution may work, but it forces me to create a table with all numbers... from 0 to infinity... am I right? - Zeller
The subquery does that, and if the range is reasonable it is easy to expand the subquery to provide a large range (you cross join a couple of subqueries, each returning the digits 0 to 9, and then use one as units, one as tens, one as hundreds, etc.). - Gallican
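
The cross-join idea from the last comment could look roughly like this. It is a sketch, not the commenter's exact query: the aliases units, tens, hundreds and d are made up for illustration, and three digit subqueries cover only the range 1 to 1000.

```sql
-- Sketch: build the numbers 1..1000 from three digit subqueries, then anti-join.
SELECT MIN(numbers.n) AS missing_value
FROM (
    SELECT units.d + 10 * tens.d + 100 * hundreds.d + 1 AS n
    FROM (
        SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
        UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) AS units
    CROSS JOIN (
        SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
        UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) AS tens
    CROSS JOIN (
        SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
        UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ) AS hundreds
) AS numbers
LEFT JOIN your_table AS yt ON numbers.n = yt.id
WHERE yt.id IS NULL;
```

If every value from 1 to 1000 is already used, this returns NULL, so the generated range has to be at least one larger than the highest id you expect.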
M
1

EDIT 2022/12/13: Summary: for best performance, the SQL should use no join and no union. That is the motivation for the following solution.

I first considered using a join as in the other, higher-ranked answers, but a plain self-join on id + 1 (without the special case for id 1) does not find the true smallest unused id. For example, 3,5,6 should give 1 as the smallest unused id, but such a join gives 4.

Another consideration is that when the column comes from a subquery, I don't want to repeat the subquery in order to join it to itself, so I came up with another way to get the true smallest unused id.

Assuming id > 0 and is unique:

select unused_id as minimum_unused_id
from (
    select
        case
        when id <> ifnull(lag(id) over (order by id), 0) + 1       -- when id <> prev_id_add_1
            then ifnull(lag(id) over (order by id), 0) + 1         -- then prev_id_add_1
        when id <> ifnull(lead(id) over (order by id), 0) - 1      -- when id <> next_id_dec_1
            then id + 1                                            -- then id + 1
        end
        as unused_id
    from (
        select 1 as id from dual
        union select 2 as id from dual
        union select 4 as id from dual
        union select 5 as id from dual
    ) unique_ids
    order by id
) t
where unused_id is not null
limit 1

The result (minimum_unused_id) is

3

Replace the unique_ids subquery with your own table or subquery.

Other combinations tested:

  • 1,2,3,4,5 -> 6
  • 3,4,5,7 -> 1
  • 1,2,4,5 -> 3

Note that if the unique_ids subquery returns no rows, the query returns no rows either; in that case the result should be taken to be 1.
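
If you would rather get 1 back from the database in the empty case, one option is to wrap the query in a scalar subquery and COALESCE. This is a sketch, not part of the original solution: the empty unique_ids subquery is only there to demonstrate the case, and the ORDER BY unused_id is an addition that makes the choice of row explicit (the candidate values never decrease as id grows, so the first row is also the smallest).

```sql
-- Sketch: fall back to 1 when there are no ids at all.
SELECT COALESCE(
    (
        SELECT unused_id
        FROM (
            SELECT
                CASE
                WHEN id <> IFNULL(LAG(id) OVER (ORDER BY id), 0) + 1
                    THEN IFNULL(LAG(id) OVER (ORDER BY id), 0) + 1
                WHEN id <> IFNULL(LEAD(id) OVER (ORDER BY id), 0) - 1
                    THEN id + 1
                END AS unused_id
            FROM (
                -- Deliberately empty input, to demonstrate the fallback
                SELECT 1 AS id FROM DUAL WHERE FALSE
            ) AS unique_ids
        ) AS t
        WHERE unused_id IS NOT NULL
        ORDER BY unused_id
        LIMIT 1
    ),
    1  -- a scalar subquery with no rows yields NULL, so COALESCE returns 1
) AS minimum_unused_id;
```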

Explanation:

lag(id) over (order by id)

will get the column value of previous record. See https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_lag.

lead(id) over (order by id)

will get the column value of next record. See https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_lead.

select
    lag(id) over (order by id) as prev_id,
    id,
    lead(id) over (order by id) as next_id
from (
    select 2 as id from dual
    union select 4 as id from dual
    union select 5 as id from dual
    union select 7 as id from dual
) YourSubQuery
order by id

will output

|prev_id|id|next_id|
|NULL   |2 |4      |
|2      |4 |5      |
|4      |5 |7      |
|5      |7 |NULL   |

As you can see, all we want is the first id for which id != prev_id + 1 or next_id != id + 1 (treating a NULL prev_id as 0 and a NULL next_id as 0).
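
To make the rule concrete, here is a sketch that shows the candidate value the CASE expression produces for every row of the 2,4,5,7 example. The alias neighbours and the extra nesting level are only there for readability and are not part of the solution itself.

```sql
-- Sketch: show prev_id, id, next_id and the resulting candidate side by side.
SELECT
    prev_id,
    id,
    next_id,
    CASE
        WHEN id <> IFNULL(prev_id, 0) + 1 THEN IFNULL(prev_id, 0) + 1  -- hole before this id
        WHEN id <> IFNULL(next_id, 0) - 1 THEN id + 1                  -- hole after this id
    END AS unused_id
FROM (
    SELECT
        LAG(id)  OVER (ORDER BY id) AS prev_id,
        id,
        LEAD(id) OVER (ORDER BY id) AS next_id
    FROM (
        SELECT 2 AS id
        UNION ALL SELECT 4
        UNION ALL SELECT 5
        UNION ALL SELECT 7
    ) AS YourSubQuery
) AS neighbours
ORDER BY id;
```

For ids 2, 4, 5, 7 the unused_id column comes out as 1, 3, 6, 6, so the first non-NULL value, 1, is the smallest unused id.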

You can copy all of the SQL above into an SQL fiddle such as https://www.db-fiddle.com/ to try it.

Another benefit of this solution is that it can fully utilize the index, e.g.,

DROP TABLE IF EXISTS `t`;
CREATE TABLE `t` (
  `id` BIGINT NOT NULL AUTO_INCREMENT,
  `type` SMALLINT,
  `sequence` SMALLINT,
  `is_valid` TINYINT DEFAULT 1,
  `del` CHAR(0) GENERATED ALWAYS AS (if(`is_valid` = 1,'',NULL)) STORED,
  `other_columns` VARCHAR(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE INDEX `uniq_idx_t_type_seq` (`type`,`del`,`sequence`) USING BTREE
);

then add test data:

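-- Drop the old procedure before switching the delimiter,
-- so that the DROP is sent as a statement of its own
DROP PROCEDURE IF EXISTS add_test_data;

DELIMITER $$
CREATE PROCEDURE add_test_data()
begin
  -- Insert sequence values 1..32767 for type 1
  set @seq = 1;
  while (@seq <= 32767) do
    insert ignore into t (type, sequence) values(1, @seq);
    set @seq = @seq + 1;
  end while;
end $$
DELIMITER ;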

call add_test_data;

DROP PROCEDURE IF EXISTS add_test_data;

Then let's look at the actual SQL:

select unused_seq
from (
    select
        case
        when sequence <> ifnull(lag(sequence) over (order by sequence), 0) + 1
            then ifnull(lag(sequence) over (order by sequence), 0) + 1
        when sequence <> ifnull(lead(sequence) over (order by sequence), 0) - 1
            then sequence + 1
        end
        as unused_seq
    from
        t
    where (type = 1 and del = '')
    order by sequence
) as t
where unused_seq is not null
limit 1

The result is 32767, and the query takes 90 ms (in a Docker container on a MacBook Pro).

The execution plan shows that it uses the index uniq_idx_t_type_seq; the ORDER BY sequence costs nothing extra because the index is already ordered that way.
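
To check the plan on your own data, you can prefix the query with EXPLAIN. This is a sketch that assumes the table t and the index defined above; what the optimizer actually picks can differ with MySQL version and data, so no output is shown here.

```sql
-- Sketch: ask MySQL for the plan of the query above.
-- In the default tabular output, the "key" column names the chosen index.
EXPLAIN
select unused_seq
from (
    select
        case
        when sequence <> ifnull(lag(sequence) over (order by sequence), 0) + 1
            then ifnull(lag(sequence) over (order by sequence), 0) + 1
        when sequence <> ifnull(lead(sequence) over (order by sequence), 0) - 1
            then sequence + 1
        end
        as unused_seq
    from
        t
    where (type = 1 and del = '')
    order by sequence
) as t
where unused_seq is not null
limit 1;
```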

The above query can easily be migrated to other databases, because window functions such as LAG/LEAD are common nowadays.

EDIT: There are other, simpler queries. For example, a colleague suggested using the Sequence storage engine (available in MariaDB, not in stock MySQL):

SELECT seq FROM seq_1_to_32767
   EXCEPT
   SELECT sequence FROM t WHERE (type = 1 AND del = '')
   ORDER BY seq   -- the sequence table's column is named seq
   LIMIT 1;       -- keep only the smallest unused value

It is more elegant, and the performance should also be good.
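
On MySQL 8.0, where the Sequence engine is not available, a recursive CTE can generate the same range. This is a swapped-in technique rather than the colleague's suggestion, and only a sketch: the CTE name seq and its column n are made up, and the upper bound 32767 is copied from the query above. The session variable has to be raised because the default recursion limit is 1000.

```sql
-- Sketch: generate 1..32767 with a recursive CTE and anti-join against t.
SET SESSION cte_max_recursion_depth = 32768;

WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 32767
)
SELECT MIN(seq.n) AS unused_seq
FROM seq
LEFT JOIN t
       ON t.type = 1
      AND t.del = ''
      AND t.sequence = seq.n
WHERE t.id IS NULL;  -- no matching row means the value is unused
```

If every value in the range is taken, this returns NULL. I have not measured it against the window-function query, so treat any performance expectation as unverified.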

Mortise answered 26/6, 2022 at 4:28 Comment(0)
R
-1

If you have the values from 1 to n in some other table, say t2, then by simply checking

select min(id1) from t2 where id1 not exist(select id from t1);

you will get your answer.

Roentgenoscope answered 8/9, 2014 at 8:7 Comment(1)
This isn't even legal MySQL. - Vacillation
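
For reference, a legal way to write what the answer seems to intend would be something like the following. It is a sketch that keeps the answer's names (t1.id for the ids in use, t2.id1 for the numbers 1 to n) and assumes that t2 really does contain every candidate value.

```sql
-- Sketch: smallest number in t2 that does not appear as an id in t1.
SELECT MIN(t2.id1) AS minimum_unused_id
FROM t2
WHERE NOT EXISTS (
    SELECT 1 FROM t1 WHERE t1.id = t2.id1
);
```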
