capypad
0 day streak
sql / expert
Snippet

Solving Gaps and Islands with Sequence Correlation

The 'Gaps and Islands' problem involves finding contiguous sequences in data. By subtracting a monotonically increasing row number from a sorted sequence of values, all members of a continuous 'island' will result in the same constant value ('grp'). This constant can then be used to group and find the boundaries of each sequence.

snippet.sql
sql
1
2
3
4
5
6
SELECT grp, MIN(val) AS start_range, MAX(val) AS end_range
FROM (
SELECT val, val - ROW_NUMBER() OVER(ORDER BY val) AS grp
FROM sequences
) t
GROUP BY grp;
Breakdown
1
ROW_NUMBER() OVER(ORDER BY val)
Generates a continuous integer sequence based on the sorted values.
2
val - ROW_NUMBER() ... AS grp
Creates a constant identifier for contiguous values in the sequence.
3
SELECT grp, MIN(val), MAX(val) ... GROUP BY grp
Aggregates the groups to find the start and end of each contiguous range.