Tutorial : Generating a list of missing numbers

A reader recently asked a question about generating a list of missing numbers from a column.  They were using an auto-incremental number column and due to deletes done on the table by different transactions, they had gaps in the values in this column.  They were not interested in re-using those values but wanted to generate a quick report to see the patterns of the gaps created.

Generating such a list is pretty simple by using a number table or by using a UDF to simulate such a number table.  We had covered in our blog posts before how to go about writing such a UDF – you can read more on it here (for SQL Server) and here (for Oracle).  For this post, we will use sample code for SQL Server 2008.

/*Declare a table variable and insert some records into it
Simulate the gaps by creating records with missing numbers - 4, 6, 8, 12, 13 and 14*/
INSERT INTO @TABLE VALUES (1), (2), (3), (5), (7), (9), (10), (11), (15), (16);

Now, let’s make use of that UDF that we had shown in one of our previous posts and generate a list of running numbers.  You will see that that function takes in two input parameters – one for the number of rows that need to be returned and the other one which feeds it the starting value from which that number list starts from.  Since in this case, the reader wanted it to be starting from 1, the second parameter has a value of 1 and the first parameter i.e. total number of rows is the maximum value in the table which in our example above is 16.

So, now all that we need to do is do a left outer join between the UDF and the table variable above:

declare @max_val int
select @max_val = MAX(id_val) from @table

select x.data_value, y.id_val
from dbo.UDF_GEN_SEQUENCE (@max_val, 1) as x
left outer join @TABLE as y
on x.data_value = y.ID_VAL

Based on this, we will get all the data from the UDF which will provide us with the list of the running numbers and only the matching records from the table variable.  Now, it is as simple as adding a IS NULL check in the where clause to get only those numbers that are missing from our table variable.

select x.data_value, y.id_val
from dbo.UDF_GEN_SEQUENCE (@max_val, 1) as x
left outer join @TABLE as y
on x.data_value = y.ID_VAL
where y.ID_VAL is null

And here is the output:

data_value    id_val
4             NULL
6             NULL
8             NULL
12            NULL
13            NULL
14            NULL

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.