希望計算每個人在日期區間內的實際天數有幾天,但會遇到日期重複的問題,不知道該如何處理
IF OBJECT_ID('Datas') IS NOT NULL
DROP TABLE Datas
CREATE TABLE Datas
(
ID char(1),
StartDate date,
EndDate date
)
INSERT INTO Datas VALUES
('A','20140101','20140115'),
('A','20140114','20140117'),
('B','20140215','20140220'),
('B','20140220','20140225'),
('C','20140301','20140305'),
('C','20140320','20140321');
- 原始資料
- 呈現結果
傳統作法
;
WITH cteStartDate AS
(
SELECT
DISTINCT ID, startdate
FROM Datas AS S1
WHERE NOT EXISTS
(
SELECT *
FROM Datas AS S2
WHERE S2.ID = S1.ID
AND S2.startdate < S1.startdate
AND S2.enddate >= S1.StartDate
)
)
,
cteEndDate AS
(
SELECT DISTINCT ID, enddate
FROM Datas AS S1
WHERE NOT EXISTS
(
SELECT *
FROM Datas AS S2
WHERE S2.ID = S1.ID
AND S2.enddate > S1.enddate
AND S2.startdate <= S1.enddate)
)
SELECT
ID,
startdate,
(
SELECT MIN(enddate)
FROM cteEndDate AS E
WHERE E.ID = S.ID
AND enddate >= startdate
) AS enddate
FROM cteStartDate AS S
視窗函數作法
該做法只適用於 SQL Server 2012 以上版本
;
WITH cteDate AS
(
SELECT
ID ,
StartDate AS dates ,
1 AS type ,
1 AS sub
FROM Datas
UNION ALL
SELECT
ID ,
EndDate ,
-1 ,
0
FROM Datas
)
,
cteRunningTotal AS
(
SELECT
* ,
SUM(type) OVER
(
PARTITION BY ID
ORDER BY Dates , Type desc
ROWS BETWEEN UNBOUNDed preceding AND current Row
) - sub AS cnt
FROM cteDate
)
,
cteGrp AS
(
SELECT * ,
((ROW_NUMBER() OVER (PARTITION BY ID , cnt ORDER BY Dates) - 1) / 2) + 1 AS Grp
FROM cteRunningTotal
WHERE Cnt = 0
)
SELECT
ID ,
MIN(dates) AS StartDate ,
MAX(dates) AS EndDate
FROM cteGrp
GROUP BY ID , Grp
簡易 Step By Step 分析Step1:cteStartDate
把起始日期和結束日期放在同一個欄位,並新增
- Type 欄位:起始日期該值為 1、結束日期該值為 -1
- GrpStartCheck 欄位:該欄用是用來判斷資料群組的起始日期是哪一筆
Step2:cteRiunningTotal
利用 SQL Server 2012 開始提供的功能,把 Type 欄位進行"累計加總",可以發現
- 當累計加總為資料群組終點,RunningTotal 欄位累計加總為 0
- 因為每個資料區間只會有一筆起始日期,每個資料區間 RunningTotal 的第一筆起始資料必定為 1,當該資料減去 GrpStartCheck 時,資料區間的起始和結束就都是 cnt = 0
Step3:cteGrp
cnt 為 0 的資料代表資料區間的起始和結束,再加上 ROW_NUMBER() 來進行資料分組,每兩筆資料為一個群組
Step4:對資料進行轉置
針對 ID 和 Grp 欄位進行群組,並利用MIX() 和 MAX() 來找出起始和結束日期,就可以得到結果
- 參考資料
- 論壇討論、論壇回應
- 開啟效能魔力窗 T-SQL 視窗函數的實務技巧
- Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions - P181 Packing Interval 範例
沒有留言:
張貼留言