一. 刪除完全重復的記錄
完全重復的數據,通常是由于沒有設置主鍵/唯一鍵約束導致的。
測試數據:
復制代碼 代碼如下:
if OBJECT_ID('duplicate_all') is not null
drop table duplicate_all
GO
create table duplicate_all
(
c1 int,
c2 int,
c3 varchar(100)
)
GO
insert into duplicate_all
select 1,100,'aaa' union all
select 1,100,'aaa' union all
select 1,100,'aaa' union all
select 1,100,'aaa' union all
select 1,100,'aaa' union all
select 2,200,'bbb' union all
select 3,300,'ccc' union all
select 4,400,'ddd' union all
select 5,500,'eee'
GO
(1) 借助臨時表
利用DISTINCT得到單條記錄,刪除源數據,然后導回不重復記錄。
如果表不大的話,可以把所有記錄導出一次,然后truncate表后再導回,這樣可以避免delete的日志操作。
復制代碼 代碼如下:
if OBJECT_ID('tempdb..#tmp') is not null
drop table #tmp
GO
select distinct * into #tmp
from duplicate_all
where c1 = 1
GO
delete duplicate_all where c1 = 1
GO
insert into duplicate_all
select * from #tmp
復制代碼 代碼如下:
with tmp
as
(
select *,ROW_NUMBER() OVER(PARTITION BY c1,c2,c3 ORDER BY(getdate())) as num
from duplicate_all
where c1 = 1
)
delete tmp where num > 1
如果多個表有完全重復的行,可以考慮通過UNION將多個表聯合,插到一個新的同結構的表,SQL Server會幫助去掉表和表之間的重復行。
二. 刪除部分重復的記錄
部分列重復的數據,通常表上是有主鍵的,可能是程序邏輯造成了多行數據列值的重復。
測試數據:
復制代碼 代碼如下:
if OBJECT_ID('duplicate_col') is not null
drop table duplicate_col
GO
create table duplicate_col
(
c1 int primary key,
c2 int,
c3 varchar(100)
)
GO
insert into duplicate_col
select 1,100,'aaa' union all
select 2,100,'aaa' union all
select 3,100,'aaa' union all
select 4,100,'aaa' union all
select 5,500,'eee'
GO
(1) 唯一索引
唯一索引有個忽略重復建的選項,在創建主鍵約束/唯一鍵約束時都可以使用這個索引選項。
復制代碼 代碼如下:
if OBJECT_ID('tmp') is not null
drop table tmp
GO
create table tmp
(
c1 int,
c2 int,
c3 varchar(100),
constraint UQ_01 unique(c2,c3) with(IGNORE_DUP_KEY = ON)
)
GO
insert into tmp
select * from duplicate_col
select * from tmp
復制代碼 代碼如下:
delete from duplicate_col
where exists(select 1 from duplicate_col b where duplicate_col.c1 > b.c1 and (duplicate_col.c2 = b.c2 and duplicate_col.c3 = b.c3))
復制代碼 代碼如下:
delete from duplicate_col
where c1 not in (select min(c1) from duplicate_col group by c2,c3)
復制代碼 代碼如下:
with tmp
as
(
select *,ROW_NUMBER() OVER(PARTITION BY c2,c3 ORDER BY(getdate())) as num
from duplicate_col
)
delete tmp where num > 1
select * from duplicate_col
新聞熱點
疑難解答