MySQL – counting repetitions in a specific order

[adsense id=”0514458240″ width=”468″ height=”60″]

I was asked how to count the number of repetitions in a set of ordered data. For example, let us imagine we have the following set of information:

xx 
xx 
xx 
yy 
yy 
xx 
xx 
xx 
yy 
yy 
xy 
yy 
yy 

What we want to do is first summarize this into:

xx 3
yy 2
xx 3
yy 2
xy 1
yy 2

To add more complexity we also want to count the number of iterations of the repetitions. In other words, this set of information should be further summarized into:

xx 3 - 2
yy 2 - 3
xy 1 - 1

In order to solve this we are going to use MySQL user session variables to keep track of the identical data values.

mysql> set @x := 0;
Query OK, 0 rows affected (0.00 sec)

mysql> set @data := "";
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT data,
    ->        IF(@data = data, @x, @x := @x + 1)  AS num,
    ->        IF(@data = data, "", @data := data) AS info
    -> FROM   data
    -> ORDER  BY id;  
+------+------+------+
| data | num  | info |
+------+------+------+
| xx   |    1 | xx   | 
| xx   |    1 |      | 
| xx   |    1 |      | 
| yy   |    2 | yy   | 
| yy   |    2 |      | 
| xx   |    3 | xx   | 
| xx   |    3 |      | 
| xx   |    3 |      | 
| yy   |    4 | yy   | 
| yy   |    4 |      | 
| xy   |    5 | xy   | 
| yy   |    6 | yy   | 
| yy   |    6 |      | 
+------+------+------+
13 rows in set (0.00 sec)

Now we have a set of data with which we can work with, the combination of the first and second field uniquely identify each iterations set. So by counting this information, we have found the summary of the iterations.

mysql> SELECT data,
    ->        COUNT(num)
    -> FROM   (SELECT data,
    ->                IF(@data = data, @x, @x := @x + 1)  AS num,
    ->                IF(@data = data, "", @data := data) AS info
    ->         FROM   data
    ->         ORDER  BY id) x
    -> GROUP  BY data,
    ->           num;
+------+------------+
| data | COUNT(num) |
+------+------------+
| xx   |          3 | 
| xx   |          3 | 
| xy   |          1 | 
| yy   |          2 | 
| yy   |          2 | 
| yy   |          2 | 
+------+------------+
6 rows in set (0.00 sec)

Now the final part to the puzzle is summarizing each of these to show each repetition:

mysql> SELECT data,
    ->        countnum,
    ->        COUNT(1) AS iterations
    -> FROM   (SELECT data,
    ->                COUNT(num) AS countnum
    ->         FROM   (SELECT data,
    ->                        IF(@data = data, @x, @x := @x + 1)  AS num,
    ->                        IF(@data = data, "", @data := data) AS info
    ->                 FROM   data
    ->                 ORDER  BY id) x
    ->         GROUP  BY data,
    ->                   num) y
    -> GROUP  BY data,
    ->           countnum;  
+------+----------+------------+
| data | countnum | iterations |
+------+----------+------------+
| xx   |        3 |          2 | 
| xy   |        1 |          1 | 
| yy   |        2 |          3 | 
+------+----------+------------+
3 rows in set (0.00 sec)