MySQL 分组后取每组前N条数据

作者:15354045@qq.com

MySQL 分组后取每组前N条数据

与oracle的 rownumber() over(partition by xxx  order by xxx )语句类似,即:对表分组后排序

CREATE TABLE `mygoods` (  
  `goods_id` int(11) unsigned NOT NULL AUTO_INCREMENT,  
  `cat_id` int(11) NOT NULL DEFAULT '0',  
  `price` tinyint(3) NOT NULL DEFAULT '0',  
  `status` tinyint(3) DEFAULT '1',  
  PRIMARY KEY (`goods_id`),  
  KEY `icatid` (`cat_id`)  
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;  
  
INSERT INTO `mygoods` VALUES (1, 101, 90, 0);  
INSERT INTO `mygoods` VALUES (2, 101, 99, 1);  
INSERT INTO `mygoods` VALUES (3, 102, 98, 0);  
INSERT INTO `mygoods` VALUES (4, 103, 96, 0);  
INSERT INTO `mygoods` VALUES (5, 102, 95, 0);  
INSERT INTO `mygoods` VALUES (6, 102, 94, 1);  
INSERT INTO `mygoods` VALUES (7, 102, 93, 1);  
INSERT INTO `mygoods` VALUES (8, 103, 99, 1);  
INSERT INTO `mygoods` VALUES (9, 103, 98, 1);  
INSERT INTO `mygoods` VALUES (10, 103, 97, 1);  
INSERT INTO `mygoods` VALUES (11, 104, 96, 1);  
INSERT INTO `mygoods` VALUES (12, 104, 95, 1);  
INSERT INTO `mygoods` VALUES (13, 104, 94, 1);  
INSERT INTO `mygoods` VALUES (15, 101, 92, 1);  
INSERT INTO `mygoods` VALUES (16, 101, 93, 1);  
INSERT INTO `mygoods` VALUES (17, 101, 94, 0);  
INSERT INTO `mygoods` VALUES (18, 102, 99, 1);  
INSERT INTO `mygoods` VALUES (19, 105, 85, 1);  
INSERT INTO `mygoods` VALUES (20, 105, 89, 0);  
INSERT INTO `mygoods` VALUES (21, 105, 99, 1);  

说明:

表mygoods为商品表,cat_id为分类id,goods_id为商品id,status为商品当前的状态位(1:有效,0:无效)。

需求:每个分类下,找出两个价格最高的有效的商品。

1.每个分类找出价格最高的两个商品

mysql> select a.*   
    -> from mygoods a   
    -> where (select count(*) 
    -> from mygoods 
    -> where cat_id = a.cat_id and price > a.price  ) <2 
    -> order by a.cat_id,a.price desc;

SQL解析:cat_id = a.cat_id  --是为了确定分组字段price > a.price  --确定排序条件,由于是大于,则最大出现0次,第二大的出现1次,所以count(*)<2

+----------+--------+-------+--------+
| goods_id | cat_id | price | status |
+----------+--------+-------+--------+
|        2 |    101 |    99 |      1 |
|       17 |    101 |    94 |      0 |
|       18 |    102 |    99 |      1 |
|        3 |    102 |    98 |      0 |
|        8 |    103 |    99 |      1 |
|        9 |    103 |    98 |      1 |
|       11 |    104 |    96 |      1 |
|       12 |    104 |    95 |      1 |
|       21 |    105 |    99 |      1 |
|       20 |    105 |    89 |      0 |
+----------+--------+-------+--------+
10 rows in set (0.00 sec)

 

2.每个分类找出价格最高的有效的两个商品(正确)

mysql> select a.* 
    -> from mygoods a 
    -> where (select count(*) from mygoods 
    -> where cat_id = a.cat_id and price > a.price and status=1  ) <2 
    -> and status=1 
    -> order by a.cat_id,a.price desc ;
+----------+--------+-------+--------+
| goods_id | cat_id | price | status |
+----------+--------+-------+--------+
|        2 |    101 |    99 |      1 |
|       16 |    101 |    93 |      1 |
|       18 |    102 |    99 |      1 |
|        6 |    102 |    94 |      1 |
|        8 |    103 |    99 |      1 |
|        9 |    103 |    98 |      1 |
|       11 |    104 |    96 |      1 |
|       12 |    104 |    95 |      1 |
|       21 |    105 |    99 |      1 |
|       19 |    105 |    85 |      1 |
+----------+--------+-------+--------+
10 rows in set (0.00 sec)
3.每个分类找出价格最高的有效的两个商品(正确)
mysql> select a.* 
    -> from mygoods a 
    -> left join mygoods b 
    -> on a.cat_id = b.cat_id and a.price < b.price and b.status=1
    -> where a.status=1
    -> group by a.goods_id,a.cat_id,a.price
    -> having count(b.goods_id) < 2
    -> order by a.cat_id,a.price desc;
+----------+--------+-------+--------+
| goods_id | cat_id | price | status |
+----------+--------+-------+--------+
|        2 |    101 |    99 |      1 | 
|       16 |    101 |    93 |      1 | 
|       18 |    102 |    99 |      1 | 
|        6 |    102 |    94 |      1 | 
|        8 |    103 |    99 |      1 | 
|        9 |    103 |    98 |      1 | 
|       11 |    104 |    96 |      1 | 
|       12 |    104 |    95 |      1 | 
|       21 |    105 |    99 |      1 | 
|       19 |    105 |    85 |      1 | 
+----------+--------+-------+--------+
10 rows in set (0.00 sec)
4.每个分类找出价格最高的有效的两个商品(错误)
mysql> select a.* 
    -> from mygoods a 
    -> where (select count(*) from mygoods 
    -> where cat_id = a.cat_id and price > a.price  ) <2 and status=1 
    -> order by a.cat_id,a.price desc;
+----------+--------+-------+--------+
| goods_id | cat_id | price | status |
+----------+--------+-------+--------+
|        2 |    101 |    99 |      1 |
|       18 |    102 |    99 |      1 |
|        8 |    103 |    99 |      1 |
|        9 |    103 |    98 |      1 |
|       11 |    104 |    96 |      1 |
|       12 |    104 |    95 |      1 |
|       21 |    105 |    99 |      1 |
+----------+--------+-------+--------+
7 rows in set (0.00 sec)
由上可知,如果需要增加条件的话,需要在两处增加条件。
 

可以将每个分组下的goods_id合并。

mysql> select cat_id,GROUP_CONCAT(goods_id) from mygoods group by cat_id;
+--------+------------------------+
| cat_id | GROUP_CONCAT(goods_id) |
+--------+------------------------+
|    101 | 1,2,15,16,17           |
|    102 | 3,5,6,7,18             |
|    103 | 4,8,9,10               |
|    104 | 11,12,13               |
|    105 | 19,20,21               |
+--------+------------------------+
5 rows in set (0.00 sec)

https://blog.csdn.net/come_on_air/article/details/72902592

MySQL取每组的前N条记录:使用自连接的方式

 

一、对分组的记录取前N条记录:例子:取前 2条最大(小)的记录

1.用子查询: SELECT
    *
FROM
    right2 a
WHERE
    2 > (
        SELECT
            COUNT(*)
        FROM
            right2 b
        WHERE
            b.id = a.id
        AND b.account > a.account
    )
ORDER BY
    a.id,
    a.account DESC 2.用exists半连接: SELECT
        *
    FROM
        right2 a
    WHERE
        EXISTS (
            SELECT
                COUNT(*)
            FROM
                right2 b
            WHERE
                b.id = a.id
            AND a.account < b.account
            HAVING
                COUNT(*) < 2
        )
    ORDER BY
        a.id,
        a.account DESC 同理可以取组内最小的N条记录: SELECT
            *
        FROM
            right2 a
        WHERE
            2 > (
                SELECT
                    COUNT(*)
                FROM
                    right2 b
                WHERE
                    b.id = a.id
                AND b.account < a.account
            )
        ORDER BY
            a.id,
            a.account DESC 用exists: SELECT
                *
            FROM
                right2 a
            WHERE
                EXISTS (
                    SELECT
                        COUNT(*)
                    FROM
                        right2 b
                    WHERE
                        b.id = a.id
                    AND a.account > b.account
                    HAVING
                        COUNT(*) < 2
                )
            ORDER BY
                a.id,
                a.account DESC SQLServer支持top - N: SELECT
                    a.*
                FROM
                    tb a
                WHERE
                    val = (
                        SELECT
                            top 3 val
                        FROM
                            tb
                        WHERE
                            NAME = a. NAME
                    )
                ORDER BY
                    a. NAME

 

如果取每组的最大(小)一条记录我常用:

SELECT
    id,
    val
FROM
    t b
INNER JOIN (
    SELECT
        *
    FROM
        t a
    WHERE

    ORDER BY
        val DESC
) a ON a.id = b.id
GROUP BY
    a.id
ORDER BY
    id;

 

二.实例:取每组最大的前 N条          

CREATE TABLE t2 (
    id INT PRIMARY KEY,
    gid CHAR,
    col1 INT,
    col2 INT
) ENGINE = INNODB;

INSERT INTO t2
VALUES
    (1, 'A', 31, 6),
    (2, 'B', 25, 83),
    (3, 'C', 76, 21),
    (4, 'D', 63, 56),
    (5, 'E', 3, 17),
    (6, 'A', 29, 97),
    (7, 'B', 88, 63),
    (8, 'C', 16, 22),
    (9, 'D', 25, 43),
    (10, 'E', 45, 28),
    (11, 'A', 2, 78),
    (12, 'B', 30, 79),
    (13, 'C', 96, 73),
    (14, 'D', 37, 40),
    (15, 'E', 14, 86),
    (16, 'A', 32, 67),
    (17, 'B', 84, 38),
    (18, 'C', 27, 9),
    (19, 'D', 31, 21),
    (20, 'E', 80, 63),
    (21, 'A', 89, 9),
    (22, 'B', 15, 22),
    (23, 'C', 46, 84),
    (24, 'D', 54, 79),
    (25, 'E', 85, 64),
    (26, 'A', 87, 13),
    (27, 'B', 40, 45),
    (28, 'C', 34, 90),
    (29, 'D', 63, 8),
    (30, 'E', 66, 40),
    (31, 'A', 83, 49),
    (32, 'B', 4, 90),
    (33, 'C', 81, 7),
    (34, 'D', 11, 12),
    (35, 'E', 85, 10),
    (36, 'A', 39, 75),
    (37, 'B', 22, 39),
    (38, 'C', 76, 67),
    (39, 'D', 20, 11),
    (40, 'E', 81, 36);

 

取每组gid 最大的前N条记录:使用自连接或则半连接:

*N=1时:

自连接:降序排好后group by取每组最大的一条。

SELECT
    *
FROM
    (
        SELECT
            *
        FROM
            t2
        ORDER BY
            col2 DESC
    ) AS a
GROUP BY
    gid
ORDER BY
    gid;

 

半连接方式:找不到比最大值还大的。

SELECT
    *
FROM
    t2 a
WHERE
    NOT EXISTS (
        SELECT
            1
        FROM
            t2 b
        WHERE
            b.gid = a.gid
        AND b.col2 > a.col2
    )
ORDER BY
    a.gid;

 

 

*N=3时:

自连接:

SELECT
    *
FROM
    t2 a
WHERE
    3 > (
        SELECT
            count(*)
        FROM
            t2
        WHERE
            gid = a.gid
        AND col2 > a.col2
    )
ORDER BY
    a.gid,
    a.col2 DESC;

 

半连接:

SELECT
    *
FROM
    t2 a
WHERE
    EXISTS (
        SELECT
            count(*)
        FROM
            t2 b
        WHERE
            b.gid = a.gid
        AND a.col2 < b.col2
        HAVING
            (count(*)) < 3
    )
ORDER BY
    a.gid,
    a.col2 DESC
Oracle取每组的前N条记录:可以使用分析函数,hive中也能使用
SELECT * FROM(
SELECT z.type , z.code ,ROW_NUMBER()
OVER(PARTITION BY z.type ORDER BY z.code desc) AS code_id
FROM group_info z
)
WHERE code_id <4;   #取每组最大的前四条记录