Finding Duplicate Data in MySQL
Summary: in this tutorial, you will learn how to find duplicate values in one or more columns in MySQL.
Before We Begin
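Duplicate data ends up in databases for many reasons, and finding duplicate values is one of the routine tasks you have to handle when working with a database. For the demonstration, we will create a table named contacts_test with four columns: id, first_name, last_name, and email.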
DROP TABLE IF EXISTS contacts_test;

CREATE TABLE contacts_test (
    id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL
);

The following statement inserts rows into the contacts_test table:
INSERT INTO contacts_test (first_name, last_name, email)
VALUES
    ('Carine ', 'Schmitt', 'carine.schmitt@verizon.net'),
    ('Jean', 'King', 'jean.king@me.com'),
    ('Peter', 'Ferguson', 'peter.ferguson@google.com'),
    ('Janine ', 'Labrune', 'janine.labrune@aol.com'),
    ('Jonas ', 'Bergulfsen', 'jonas.bergulfsen@mac.com'),
    ('Janine ', 'Labrune', 'janine.labrune@aol.com'),
    ('Susan', 'Nelson', 'susan.nelson@comcast.net'),
    ('Zbyszek ', 'Piestrzeniewicz', 'zbyszek.piestrzeniewicz@att.net'),
    ('Roland', 'Keitel', 'roland.keitel@yahoo.com'),
    ('Julie', 'Murphy', 'julie.murphy@yahoo.com'),
    ('Kwai', 'Lee', 'kwai.lee@google.com'),
    ('Jean', 'King', 'jean.king@me.com'),
    ('Susan', 'Nelson', 'susan.nelson@comcast.net'),
    ('Roland', 'Keitel', 'roland.keitel@yahoo.com');
SELECT * FROM contacts_test;
+----+------------+-----------------+---------------------------------+
| id | first_name | last_name       | email                           |
+----+------------+-----------------+---------------------------------+
|  1 | Carine     | Schmitt         | carine.schmitt@verizon.net      |
|  2 | Jean       | King            | jean.king@me.com                |
|  3 | Peter      | Ferguson        | peter.ferguson@google.com       |
|  4 | Janine     | Labrune         | janine.labrune@aol.com          |
|  5 | Jonas      | Bergulfsen      | jonas.bergulfsen@mac.com        |
|  6 | Janine     | Labrune         | janine.labrune@aol.com          |
|  7 | Susan      | Nelson          | susan.nelson@comcast.net        |
|  8 | Zbyszek    | Piestrzeniewicz | zbyszek.piestrzeniewicz@att.net |
|  9 | Roland     | Keitel          | roland.keitel@yahoo.com         |
| 10 | Julie      | Murphy          | julie.murphy@yahoo.com          |
| 11 | Kwai       | Lee             | kwai.lee@google.com             |
| 12 | Jean       | King            | jean.king@me.com                |
| 13 | Susan      | Nelson          | susan.nelson@comcast.net        |
| 14 | Roland     | Keitel          | roland.keitel@yahoo.com         |
+----+------------+-----------------+---------------------------------+
14 rows in set (0.00 sec)

In the contacts_test table, some rows have duplicate values in the first_name, last_name, and email columns. Let's learn how to find them.
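Before looking for individual values, a quick way to confirm that a column contains duplicates at all is to compare the total number of rows with the number of distinct values. The following is a minimal sketch that assumes the contacts_test table created above; the aliases total_rows and distinct_emails are illustrative names, not part of the original tutorial.

```sql
-- Compare the number of rows with the number of distinct emails.
-- If the two counts differ, at least one email appears more than once.
SELECT
    COUNT(*) AS total_rows,
    COUNT(DISTINCT email) AS distinct_emails
FROM
    contacts_test;
```

With the sample data, this should report 14 rows against 10 distinct emails, which signals that some emails are repeated.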
Finding Duplicate Values in One Column
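To find duplicate values in a single column of a table, use the following statement:

SELECT
    col,
    COUNT(col)
FROM
    table_name
GROUP BY col
HAVING COUNT(col) > 1;

A value is considered a duplicate if it appears more than once in the table. In this statement, the GROUP BY clause together with the COUNT function counts how many times each value of the specified column (col) occurs. The condition in the HAVING clause keeps only the groups whose count is greater than 1, that is, the duplicated values.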
You can use this query to find every duplicated email in the contacts_test table, as follows:
SELECT
    email,
    COUNT(email)
FROM
    contacts_test
GROUP BY email
HAVING COUNT(email) > 1;

The following shows the output of the query:
+--------------------------+--------------+
| email                    | COUNT(email) |
+--------------------------+--------------+
| janine.labrune@aol.com   |            2 |
| jean.king@me.com         |            2 |
| roland.keitel@yahoo.com  |            2 |
| susan.nelson@comcast.net |            2 |
+--------------------------+--------------+
4 rows in set (0.00 sec)

As you can see, some rows share the same email.
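The grouped query returns one line per duplicated email, not the underlying rows. To see the rows themselves, one option is to join the table back to the grouped result. This is only a sketch, assuming the contacts_test table defined above; the aliases c and dup are illustrative names.

```sql
-- List every row whose email occurs more than once.
SELECT
    c.id,
    c.first_name,
    c.last_name,
    c.email
FROM
    contacts_test AS c
INNER JOIN (
    -- One row per duplicated email
    SELECT email
    FROM contacts_test
    GROUP BY email
    HAVING COUNT(*) > 1
) AS dup ON dup.email = c.email
ORDER BY
    c.email,
    c.id;
```

With the sample data, this should return eight rows: two for each of the four duplicated emails.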
Finding Duplicate Values in Multiple Columns
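Sometimes you want to find duplicates based on several columns rather than one. In that case, you can use the following query:

SELECT
    col1, COUNT(col1),
    col2, COUNT(col2),
    ...
FROM
    table_name
GROUP BY
    col1,
    col2, ...
HAVING
    (COUNT(col1) > 1) AND
    (COUNT(col2) > 1) AND
    ...

A row is treated as a duplicate only when the whole combination of columns repeats. The GROUP BY clause groups the rows by that combination, and the HAVING clause joins the per-column count conditions with the AND operator.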
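Because GROUP BY already groups on the whole column combination, each per-column count inside a group equals the number of rows in that group whenever the columns are NOT NULL, as they are in contacts_test. Under that assumption, a single COUNT(*) condition should be enough. The following sketch is an alternative formulation, not the tutorial's own query; the alias occurrences is an illustrative name.

```sql
-- Find combinations of first_name, last_name, and email
-- that appear in more than one row.
SELECT
    first_name,
    last_name,
    email,
    COUNT(*) AS occurrences
FROM
    contacts_test
GROUP BY
    first_name,
    last_name,
    email
HAVING COUNT(*) > 1;
```

If the columns allowed NULL, COUNT(col) would skip the NULL values while COUNT(*) would still count the rows, so the two forms could then give different results.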
For example, to find the rows in the contacts_test table that have duplicate values in the first_name, last_name, and email columns, you can use the following query:
SELECT
    first_name, COUNT(first_name),
    last_name, COUNT(last_name),
    email, COUNT(email)
FROM
    contacts_test
GROUP BY
    first_name,
    last_name,
    email
HAVING
    COUNT(first_name) > 1
    AND COUNT(last_name) > 1
    AND COUNT(email) > 1;

The following shows the output of the query:
+------------+-------------------+-----------+------------------+--------------------------+--------------+
| first_name | COUNT(first_name) | last_name | COUNT(last_name) | email                    | COUNT(email) |
+------------+-------------------+-----------+------------------+--------------------------+--------------+
| Janine     |                 2 | Labrune   |                2 | janine.labrune@aol.com   |            2 |
| Jean       |                 2 | King      |                2 | jean.king@me.com         |            2 |
| Roland     |                 2 | Keitel    |                2 | roland.keitel@yahoo.com  |            2 |
| Susan      |                 2 | Nelson    |                2 | susan.nelson@comcast.net |            2 |
+------------+-------------------+-----------+------------------+--------------------------+--------------+
4 rows in set (0.00 sec)

In this tutorial, you learned how to find duplicate rows in MySQL based on the values of one or more columns.