1.4 模块化方法
视图可用于以模块化方法开发解决方案。问题的每一步都用一个查询解决,并根据查询定义视图。这样一次只关注其中的某一步,从而简化了解决方案。
我将通过一个示例演示模块化方法。首先,运行代码清单5-1中的代码创建并填充Sales表。
代码清单 5-1创建并填充Sales表
SET NOCOUNT ON;
USE tempdb;
GO
IF OBJECT_ID(‘dbo.Sales’) IS NOT NULL
DROP TABLE dbo.Sales;
GO CREATE TABLE dbo.Sales
(
mnth DATETIME NOT NULL PRIMARY KEY,
qty INT NOT NULL
);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20041201’, 100);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050101’, 110);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050201’, 120);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050301’, 130);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050401’, 140);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050501’, 140);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050601’, 130);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050701’, 120);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050801’, 110);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20050901’, 100);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20051001’, 110);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20051101’, 100);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20051201’, 120);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060101’, 130);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060201’, 140);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060301’, 100);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060401’, 100);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060501’, 100);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060601’, 110);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060701’, 120);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060801’, 110);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20060901’, 120);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20061001’, 130);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20061101’, 140);
INSERT INTO dbo.Sales(mnth, qty) VALUES(‘20061201’, 100);
该表每个月占一行,其中包含销售数量(qty列)和月份(mnth列)。要注意我使用了DATETIME数据类型存储月份以支持日相关的计算。尽管我只关心其中的年份和月份,我也必须在日部分指定一些值。所以我把每个月的第一天作为日部分,并假设子夜为默认时间。当你需要显示日期时,你总是可以从完整的时间值中提取需要的元素。
现在的任务是返回销售趋势相同的连续月份。也就是说,标记出趋势(up、same、down或unkown)相同的月份范围。某月份的趋势是用它的qty值减去上月的qty值得出的。如果差额为正,趋势为‘up’;如果为负,趋势为‘down’;如果差额为0,则趋势为‘same’;否则,趋势为‘unkown’。表5-5显示了期望的结果。
表5-5 相同销售趋势的月份范围
|
start_range |
end_range |
trend |
|
200412 |
200412 |
unknown |
|
200501 |
200504 |
up |
|
200505 |
200505 |
same |
|
200506 |
200509 |
down |
|
200510 |
200510 |
up |
|
200511 |
200511 |
down |
续表
|
start_range |
end_range |
trend |
|
200512 |
200602 |
up |
|
200603 |
200603 |
down |
|
200604 |
200605 |
same |
|
200606 |
200607 |
up |
|
200608 |
200608 |
down |
|
200609 |
200611 |
up |
|
200612 |
200612 |
down |
用单个查询解决这个问题可能太过于复杂。可以把解决方案分成多个步骤。我先演示在SQL Server 2000中的解决方案,然后使用SQL Server 2005中的新功能改进它。
先计算出当前月份数量与上月数量差额的标记,可以通过创建VSgn视图来完成该计算,就像这样:
IF OBJECT_ID(‘dbo.VSgn’) IS NOT NULL
DROP VIEW dbo.VSgn;
GO
CREATE VIEW dbo.VSgn
AS
SELECT mnth, qty,
SIGN((S1.qty –
(SELECT TOP 1 qty
FROM dbo.Sales AS S2
WHERE S2.mnth < S1.mnth
ORDER BY S2.mnth DESC))) AS sgn
FROM dbo.Sales AS S1
GO
还记得吗?在SQL Server 2000中不能用分号结束CREATE VIEW语句,也不能用圆括号作为TOP的输入。VSgn视图的内容如表5-6所示。
有5-6 VSgn 的内容
|
mnth |
qty |
sgn |
|
2004-12-01 00:00:00.000 |
100 |
NULL |
|
2005-01-01 00:00:00.000 |
110 |
1 |
|
2005-02-01 00:00:00.000 |
120 |
1 |
|
2005-03-01 00:00:00.000 |
130 |
1 |
|
2005-04-01 00:00:00.000 |
140 |
1 |
|
2005-05-01 00:00:00.000 |
140 |
0 |
|
2005-06-01 00:00:00.000 |
130 |
-1 |
|
2005-07-01 00:00:00.000 |
120 |
-1 |
续表
|
mnth |
qty |
sgn |
|
2005-08-01 00:00:00.000 |
110 |
-1 |
|
2005-09-01 00:00:00.000 |
100 |
-1 |
|
2005-10-01 00:00:00.000 |
110 |
1 |
|
2005-11-01 00:00:00.000 |
100 |
-1 |
|
2005-12-01 00:00:00.000 |
120 |
1 |
|
2006-01-01 00:00:00.000 |
130 |
1 |
|
2006-02-01 00:00:00.000 |
140 |
1 |
|
2006-03-01 00:00:00.000 |
100 |
-1 |
|
2006-04-01 00:00:00.000 |
100 |
0 |
|
2006-05-01 00:00:00.000 |
100 |
0 |
|
2006-06-01 00:00:00.000 |
110 |
1 |
|
2006-07-01 00:00:00.000 |
120 |
1 |
|
2006-08-01 00:00:00.000 |
110 |
-1 |
|
2006-09-01 00:00:00.000 |
120 |
1 |
|
2006-10-01 00:00:00.000 |
130 |
1 |
|
2006-11-01 00:00:00.000 |
140 |
1 |
|
2006-12-01 00:00:00.000 |
100 |
-1 |
SIGN函数在输入为正时返回1,输入为0时返回0,输入为负时返回-1,对于NULL输入返回NULL。sgn列实际上表示当前月份的销售趋势。这时,你想对具有相同销售趋势的连续月份分组。为此,你首先需要计算分组因子(grouping factor)——用于标识该组的一个值。趋势不同于当前月份趋势的最早月份可以作为分组因子。如果你考虑一下,就会发现对于相同趋势的连续月份来说,这个值是相同的。
运行下面的代码创建VGrp视图,它将计算分组因子。
IF OBJECT_ID(‘dbo.VGrp’) IS NOT NULL
DROP VIEW dbo.VGrp;
GO
CREATE VIEW dbo.VGrp
AS
SELECT mnth, sgn,
(SELECT MIN(mnth) FROM dbo.VSgn AS V2
WHERE V2.sgn <> V1.sgn
AND V2.mnth > V1.mnth) AS grp
FROM dbo.VSgn AS V1
GO
VGrp视图的内容如表5-7所示。
表5-7 VGrp 的内容
|
mnth |
sgn |
Grp |
|
2004-12-01 00:00:00.000 |
NULL |
NULL |
|
2005-01-01 00:00:00.000 |
1 |
2005-05-01 00:00:00.000 |
|
2005-02-01 00:00:00.000 |
1 |
2005-05-01 00:00:00.000 |
|
2005-03-01 00:00:00.000 |
1 |
2005-05-01 00:00:00.000 |
|
2005-04-01 00:00:00.000 |
1 |
2005-05-01 00:00:00.000 |
|
2005-05-01 00:00:00.000 |
0 |
2005-06-01 00:00:00.000 |
|
2005-06-01 00:00:00.000 |
-1 |
2005-10-01 00:00:00.000 |
|
2005-07-01 00:00:00.000 |
-1 |
2005-10-01 00:00:00.000 |
|
2005-08-01 00:00:00.000 |
-1 |
2005-10-01 00:00:00.000 |
|
2005-09-01 00:00:00.000 |
-1 |
2005-10-01 00:00:00.000 |
|
2005-10-01 00:00:00.000 |
1 |
2005-11-01 00:00:00.000 |
|
2005-11-01 00:00:00.000 |
-1 |
2005-12-01 00:00:00.000 |
|
2005-12-01 00:00:00.000 |
1 |
2006-03-01 00:00:00.000 |
|
2006-01-01 00:00:00.000 |
1 |
2006-03-01 00:00:00.000 |
|
2006-02-01 00:00:00.000 |
1 |
2006-03-01 00:00:00.000 |
|
2006-03-01 00:00:00.000 |
-1 |
2006-04-01 00:00:00.000 |
|
2006-04-01 00:00:00.000 |
0 |
2006-06-01 00:00:00.000 |
|
2006-05-01 00:00:00.000 |
0 |
2006-06-01 00:00:00.000 |
|
2006-06-01 00:00:00.000 |
1 |
2006-08-01 00:00:00.000 |
|
2006-07-01 00:00:00.000 |
1 |
2006-08-01 00:00:00.000 |
|
2006-08-01 00:00:00.000 |
-1 |
2006-09-01 00:00:00.000 |
|
2006-09-01 00:00:00.000 |
1 |
2006-12-01 00:00:00.000 |
|
2006-10-01 00:00:00.000 |
1 |
2006-12-01 00:00:00.000 |
|
2006-11-01 00:00:00.000 |
1 |
2006-12-01 00:00:00.000 |
|
2006-12-01 00:00:00.000 |
-1 |
NULL |
你可以看到,grp列的值对于每组具有相同趋势的连续月份组来说是独一无二的。唯一的例外是两个NULL。2004年12月份是个NULL,因为在该月份表示一个未知的趋势,2006年12月份也是一个NULL,这是因为这个月份之后没有数据。两个NULL属于两个不同的连续趋势组,但通过用sgn(表示趋势)和grp定义组,你可以很容易地解决这个问题。
后面的部分很简单。按sgn和grp分组,返回MIN(mnth)作为范围的开始,返回MAX(mnth)作为范围的结束。并使用CASE表达式把sgn值转换到更具描述性的趋势表现形式。
运行下面的代码创建Vtrends视图以实现这一步。
IF OBJECT_ID(‘dbo.Vtrends’) IS NOT NULL
DROP VIEW dbo.Vtrends;
GO
CREATE VIEW dbo.Vtrends
AS
SELECT
CONVERT(VARCHAR(6), MIN(mnth), 112) AS start_range,
CONVERT(VARCHAR(6), MAX(mnth), 112) AS end_range,
CASE sgn
WHEN -1 THEN ‘down’
WHEN 0 THEN ‘same’
WHEN 1 THEN ‘up’
ELSE ‘unknown’
END AS trend
FROM dbo.VGrp
GROUP BY sgn, grp
GO
如果你使用下面的代码查询Vtrends,将得到前面表5-5所示的期望结果。
SELECT start_range, end_range, trend
FROM dbo.Vtrends
ORDER BY start_range;
在SQL Server 2005中,有几个新特性可以优化该解决方案。先创建一个名为VsalesRN的视图,并包含按mnth排序的行号。
IF OBJECT_ID(‘dbo.VsalesRN’) IS NOT NULL
DROP VIEW dbo.VsalesRN;
GO
CREATE VIEW dbo.VsalesRN
AS
SELECT mnth, qty, ROW_NUMBER() OVER(ORDER BY mnth) AS rn
FROM dbo.Sales;
GO
视图的内容如表5-8所示。
表5-8 VsalesRN 的内容
|
mnth |
qty |
rn |
|
2004-12-01 00:00:00.000 |
100 |
1 |
|
2005-01-01 00:00:00.000 |
110 |
2 |
|
2005-02-01 00:00:00.000 |
120 |
3 |
|
2005-03-01 00:00:00.000 |
130 |
4 |
|
2005-04-01 00:00:00.000 |
140 |
5 |
|
2005-05-01 00:00:00.000 |
140 |
6 |
|
2005-06-01 00:00:00.000 |
130 |
7 |
续表
|
mnth |
qty |
rn |
|
2005-07-01 00:00:00.000 |
120 |
8 |
|
2005-08-01 00:00:00.000 |
110 |
9 |
|
2005-09-01 00:00:00.000 |
100 |
10 |
|
2005-10-01 00:00:00.000 |
110 |
11 |
|
2005-11-01 00:00:00.000 |
100 |
12 |
|
2005-12-01 00:00:00.000 |
120 |
13 |
|
2006-01-01 00:00:00.000 |
130 |
14 |
|
2006-02-01 00:00:00.000 |
140 |
15 |
|
2006-03-01 00:00:00.000 |
100 |
16 |
|
2006-04-01 00:00:00.000 |
100 |
17 |
|
2006-05-01 00:00:00.000 |
100 |
18 |
|
2006-06-01 00:00:00.000 |
110 |
19 |
|
2006-07-01 00:00:00.000 |
120 |
20 |
|
2006-08-01 00:00:00.000 |
110 |
21 |
|
2006-09-01 00:00:00.000 |
120 |
22 |
|
2006-10-01 00:00:00.000 |
130 |
23 |
|
2006-11-01 00:00:00.000 |
140 |
24 |
|
2006-12-01 00:00:00.000 |
100 |
25 |
在视图VSgn中,联接VsalesRN的两个实例并用上一个月份的行匹配当前行。然后访问当前月份和上一个月份的gty值,并计算差额的标记。下面是SQL Server 2005中VSgn视图的代码。
IF OBJECT_ID(‘dbo.VSgn’) IS NOT NULL
DROP VIEW dbo.VSgn;
GO
CREATE VIEW dbo.VSgn
AS
SELECT Cur.mnth, Cur.qty, SIGN(Cur.qty – Prv.qty) AS sgn
FROM dbo.VsalesRN AS Cur
LEFT OUTER JOIN dbo.VSalesRN AS Prv
ON Cur.rn = Prv.rn + 1;
GO
通过修改VGrp视图,你可以进一步优化该方案, 使用下面的方法计算分组因子。
IF OBJECT_ID(‘dbo.VGrp’) IS NOT NULL
DROP VIEW dbo.VGrp;
GO
CREATE VIEW dbo.VGrp
AS
SELECT mnth, sgn,
DATEADD(month,
-ROW_NUMBER() OVER(PARTITION BY sgn ORDER BY mnth),
mnth) AS grp
FROM dbo.VSgn;
GO YYY
计算分组因子的逻辑有点复杂。根据sgn(趋势)分区并按mnth的顺序计算出行号(rn)。它的意思是,对于每一种趋势,你可以有多个连续的组,它们之间会有间断。试着思考一下在某个趋势中随着rn的递增,mnth值如何变化。只要它们还位于同一个连续组中,它们都会以1为单位递增。一旦出现间断,mnth的递增量会大于1,而rn的递增量还是1。你可以得出这样的结论,如果你从mnth减去rn个月份,每个连续组的结果将成为恒量且唯一。就像我提到的,这个逻辑有点复杂,不太好掌握。要更好地理解它,我建议你把VGrp的查询单独拿出来研究。例如,返回行号本身(不再让它参与计算),等等。
最后,创建视图Vtrends以按sgn和grp对数据分组,返回具有相同趋势的连续月份范围。
IF OBJECT_ID(‘dbo.Vtrends’) IS NOT NULL
DROP VIEW dbo.Vtrends;
GO
CREATE VIEW dbo.Vtrends
AS
SELECT
CONVERT(VARCHAR(6), MIN(mnth), 112) AS start_range,
CONVERT(VARCHAR(6), MAX(mnth), 112) AS end_range,
CASE sgn
WHEN -1 THEN ‘down’
WHEN 0 THEN ‘same’
WHEN 1 THEN ‘up’
ELSE ‘unknown’
END AS trend
FROM dbo.VGrp
GROUP BY sgn, grp;
GO
查询Vtrends视图,你会得到期望的结果。
SELECT start_range, end_range, trend
FROM dbo.Vtrends
ORDER BY start_range;
SQL Server 2005支持CTE,它也允许你使用模块化方法开发解决方案。实际上,你可以把CTE 看作是只存在于外部查询的内联视图。其实除了允许模块化开发方法,没有理由再在解决方案中创建中间视图(intermediate view)。相反,你只需要创建最后这个视图(Vtrends)。那为什么还要在数据库中把它们创建为对象呢?在SQL Server 2000
中,你没有更好的选择,但在SQL Server 2005中你可以只创建一个由CTE定义的视图(Vtrends),这些CTE使用模块化方法开发。运行代码清单5-2中的代码修改Vtrends视图,用在同一个WITH语句中定义的多个CTE实现该视图,从而替代原来定义的多个视图。
代码清单5-2 用CTE实现Vtrends
ALTER VIEW dbo.Vtrends
AS
WITH CsalesRN AS
(
SELECT mnth, qty, ROW_NUMBER() OVER(ORDER BY mnth) AS rn
FROM dbo.Sales
),
CSgn AS
(
SELECT Cur.mnth, Cur.qty, SIGN(Cur.qty – Prv.qty) AS sgn
FROM CsalesRN AS Cur
LEFT OUTER JOIN CsalesRN AS Prv
ON Cur.rn = Prv.rn + 1
),
CGrp AS
(
SELECT mnth, sgn,
DATEADD(month,
-ROW_NUMBER() OVER(PARTITION BY sgn ORDER BY mnth),
mnth) AS grp
FROM CSgn
)
SELECT
CONVERT(VARCHAR(6), MIN(mnth), 112) AS start_range,
CONVERT(VARCHAR(6), MAX(mnth), 112) AS end_range,
CASE sgn
WHEN -1 THEN ‘down’
WHEN 0 THEN ‘same’
WHEN 1 THEN ‘up’
ELSE ‘unknown’
END AS trend
FROM CGrp
GROUP BY sgn, grp;
GO
如果你使用下面的代码查询Vtrends,将得到前面表5-5所示的理想结果。
SELECT start_range, end_range, trend
FROM dbo.Vtrends
ORDER BY start_range;
总之,使用模块化方法开发解决方案简化了开发过程并减少出现bug和错误的可能。
完成后,运行下面的代码进行清理:
IF OBJECT_ID(‘dbo.Vtrends’) IS NOT NULL
DROP VIEW dbo.Vtrends;
GO
IF OBJECT_ID(‘dbo.VGrp’) IS NOT NULL
DROP VIEW dbo.VGrp;
GO
IF OBJECT_ID(‘dbo.VSgn’) IS NOT NULL
DROP VIEW dbo.VSgn;
GO
IF OBJECT_ID(‘dbo.VsalesRN’) IS NOT NULL
DROP VIEW dbo.VsalesRN;
GO
IF OBJECT_ID(‘dbo.Sales’) IS NOT NULL
DROP TABLE dbo.Sales;
GO