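Oracle has provided analytic functions since release 8.1.6. An analytic function computes an aggregate value over a group of rows. It differs from an aggregate function in that it returns multiple rows for each group, whereas an aggregate function returns only one row per group.

The tables used in the examples below come from the HR schema that ships with Oracle. If that schema is not installed, you can create it by running $ORACLE_HOME/demo/schema/human_resources/hr_main.sql as SYS. A few examples need tables in the SH schema. If that schema is not installed, you can create it by running $ORACLE_HOME/demo/schema/sales_history/sh_main.sql as SYS. Unless stated otherwise, the examples run as the HR user.

Understanding the windowing clause:
The windowing clause specifies the size of the data window that the analytic function works on. The window may change from row to row. For example:

over(order by salary)
    Accumulates in salary order. ORDER BY on its own implies a default window that runs from the first row to the current row.
over(partition by deptno)
    Partitions by department.
over(order by salary range between 50 preceding and 150 following)
    The window for each row contains the rows whose salary is at most 50 below and at most 150 above the current row's salary.
over(order by salary rows between 50 preceding and 150 following)
    The window for each row contains the 50 rows before it and the 150 rows after it.
over(order by salary rows between unbounded preceding and unbounded following)
    The window for each row runs from the first row to the last row. Equivalent to:
over(order by salary range between unbounded preceding and unbounded following)

Main references:
"Expert One-on-One", Tom Kyte
"Oracle9i SQL Reference", Chapter 6

AVG
Description: Computes the average of an expression over a group and a data window.
SAMPLE: In the example below, the column c_mavg is a moving average of salary for each employee. It averages the current employee with the previous and the next employee (in hire_date order) who report to the same manager.

SELECT manager_id, last_name, hire_date, salary,
       AVG(salary) OVER (PARTITION BY manager_id ORDER BY hire_date
                         ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS c_mavg
  FROM employees;

MANAGER_ID LAST_NAME                 HIRE_DATE     SALARY     C_MAVG
---------- ------------------------- --------- ---------- ----------
       100 Kochhar                   21-SEP-89      17000      17000
       100 De Haan                   13-JAN-93      17000      15000
       100 Raphaely                  07-DEC-94      11000 11966.6667
       100 Kaufling                  01-MAY-95       7900 10633.3333
       100 Hartstein                 17-FEB-96      13000 9633.33333
       100 Weiss                     18-JUL-96       8000 11666.6667
       100 Russell                   01-OCT-96      14000 11833.3333
...

CORR
Description: Returns the coefficient of correlation of a pair of expressions. It is shorthand for:
    COVAR_POP(expr1, expr2) / (STDDEV_POP(expr1) * STDDEV_POP(expr2))
Statistically, correlation is the strength of the association between variables. An association between variables means that the value of one variable can, to some extent, be predicted from the value of the other.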
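The relationship between CORR, COVAR_POP and STDDEV_POP can be checked numerically outside the database. The following is a minimal Python sketch, not Oracle's implementation. The helper names (covar_pop, stddev_pop, corr) and the sample data are made up for illustration.

```python
import math

def covar_pop(xs, ys):
    """Population covariance: mean of the products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def stddev_pop(xs):
    """Population standard deviation: square root of the covariance of xs with itself."""
    return math.sqrt(covar_pop(xs, xs))

def corr(xs, ys):
    """COVAR_POP(x, y) / (STDDEV_POP(x) * STDDEV_POP(y))."""
    return covar_pop(xs, ys) / (stddev_pop(xs) * stddev_pop(ys))

# Hypothetical data: y rises (or falls) exactly in step with x.
rising = corr([1, 2, 3], [2, 4, 6])
falling = corr([1, 2, 3], [6, 4, 2])
print(round(rising, 6), round(falling, 6))
```

A perfectly linear rising relationship gives a value of about 1, and a perfectly linear falling relationship gives about -1.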
The correlation coefficient expresses the strength of the association as a number between -1 and 1, where 0 means no correlation.
SAMPLE: The example below returns the cumulative correlation between monthly sales revenue and monthly units sold for 1998 (run as the SH user).

SELECT t.calendar_month_number,
       CORR(SUM(s.amount_sold), SUM(s.quantity_sold))
         OVER (ORDER BY t.calendar_month_number) AS CUM_CORR
  FROM sales s, times t
 WHERE s.time_id = t.time_id AND calendar_year = 1998
 GROUP BY t.calendar_month_number
 ORDER BY t.calendar_month_number;

CALENDAR_MONTH_NUMBER   CUM_CORR
--------------------- ----------
                    1
                    2          1
                    3 .994309382
                    4 .852040875
                    5 .846652204
                    6 .871250628
                    7 .910029803
                    8 .917556399
                    9 .920154356
                   10  .86720251
                   11 .844864765
                   12 .903542662

COVAR_POP
Description: Returns the population covariance of a pair of expressions.
SAMPLE: In the example below, CUM_COVP returns the cumulative population covariance of list price and minimum product price.

SELECT product_id, supplier_id,
       COVAR_POP(list_price, min_price)
         OVER (ORDER BY product_id, supplier_id) AS CUM_COVP,
       COVAR_SAMP(list_price, min_price)
         OVER (ORDER BY product_id, supplier_id) AS CUM_COVS
  FROM product_information p
 WHERE category_id = 29
 ORDER BY product_id, supplier_id;

PRODUCT_ID SUPPLIER_ID   CUM_COVP   CUM_COVS
---------- ----------- ---------- ----------
      1774      103088          0
      1775      103087    1473.25     2946.5
      1794      103096 1702.77778 2554.16667
      1825      103093    1926.25 2568.33333
      2004      103086     1591.4    1989.25
      2005      103086     1512.5       1815
      2416      103088 1475.97959 1721.97619
..

COVAR_SAMP
Description: Returns the sample covariance of a pair of expressions.
SAMPLE: In the example below, CUM_COVS returns the cumulative sample covariance of list price and minimum product price.

SELECT product_id, supplier_id,
       COVAR_POP(list_price, min_price)
         OVER (ORDER BY product_id, supplier_id) AS CUM_COVP,
       COVAR_SAMP(list_price, min_price)
         OVER (ORDER BY product_id, supplier_id) AS CUM_COVS
  FROM product_information p
 WHERE category_id = 29
 ORDER BY product_id, supplier_id;

PRODUCT_ID SUPPLIER_ID   CUM_COVP   CUM_COVS
---------- ----------- ---------- ----------
      1774      103088          0
      1775      103087    1473.25     2946.5
      1794      103096 1702.77778 2554.16667
      1825      103093    1926.25 2568.33333
      2004      103086     1591.4    1989.25
      2005      103086     1512.5       1815
      2416      103088 1475.97959 1721.97619
..

COUNT
Description: Keeps a running count of the rows in a group. If you specify * or a non-null constant, COUNT counts all rows. If you specify an expression, COUNT returns the number of rows for which the expression is not null, and rows with equal values are all counted. Use DISTINCT to count the rows that remain after duplicate values in the group are removed.
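The three counting modes described above can be illustrated with a small Python sketch. This is only an analogy for the SQL semantics: the list of values is hypothetical, and None stands in for NULL.

```python
# Hypothetical column values; None plays the role of SQL NULL.
rows = [2500, None, 2500, 3100, None, 2800]

# COUNT(*): every row counts, including rows where the value is NULL.
count_star = len(rows)

# COUNT(expr): only rows where the expression is not NULL.
count_expr = sum(1 for v in rows if v is not None)

# COUNT(DISTINCT expr): non-NULL values with duplicates removed.
count_distinct = len({v for v in rows if v is not None})

print(count_star, count_expr, count_distinct)  # 6 4 3
```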
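SAMPLE: The example below counts, for each employee in salary order, the rows around the current row whose salary lies in [n-50, n+150], where n is the current row's salary. For example, Philtanker's salary is 2200. Among the rows sorted before him, 1 row has a salary of at least 2200-50. Among the rows sorted after him, none has a salary of at most 2200+150. The count cnt3 is therefore 2 (including the current row). cnt2 is the number of rows whose SALARY is less than or equal to that of the current row.

SELECT last_name, salary,
       COUNT(*) OVER () AS cnt1,
       COUNT(*) OVER (ORDER BY salary) AS cnt2,
       COUNT(*) OVER (ORDER BY salary
                      RANGE BETWEEN 50 PRECEDING AND 150 FOLLOWING) AS cnt3
  FROM employees;

LAST_NAME                     SALARY       CNT1       CNT2       CNT3
------------------------- ---------- ---------- ---------- ----------
Olson                           2100        107          1          3
Markle                          2200        107          3          2
Philtanker                      2200        107          3          2
Landry                          2400        107          5          8
Gee                             2400        107          5          8
Colmenares                      2500        107         11         10
Patel                           2500        107         11         10
..

CUME_DIST
Description: Computes the relative position of a row within a group. CUME_DIST always returns a number greater than 0 and less than or equal to 1, which represents the position of the row among N rows. For example, in a group of 3 rows the cumulative distribution values are 1/3, 2/3 and 3/3.
SAMPLE: The example below computes, for each job, the cumulative distribution of employees in salary order.

SELECT job_id, last_name, salary,
       CUME_DIST() OVER (PARTITION BY job_id ORDER BY salary) AS cume_dist
  FROM employees
 WHERE job_id LIKE 'PU%';

JOB_ID     LAST_NAME                     SALARY  CUME_DIST
---------- ------------------------- ---------- ----------
PU_CLERK   Colmenares                      2500         .2
PU_CLERK   Himuro                          2600         .4
PU_CLERK   Tobias                          2800         .6
PU_CLERK   Baida                           2900         .8
PU_CLERK   Khoo                            3100          1
PU_MAN     Raphaely                       11000          1

DENSE_RANK
Description: Computes the position of each row returned by the query relative to the other rows, based on the values of the expressions in the ORDER BY clause.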
The rows in a group are sorted by the ORDER BY clause, and each row is then assigned a number, forming a sequence that starts at 1 and counts upward. The sequence increases each time the value of the ORDER BY expression changes. Rows with the same value receive the same number (nulls are considered equal to each other). A dense sequence has no gaps in the numbers it returns.
SAMPLE: The example below computes the sequence number of each employee, partitioned by department and ordered by salary (note the difference from the RANK function).

SELECT d.department_id, e.last_name, e.salary,
       DENSE_RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary) AS drank
  FROM employees e, departments d
 WHERE e.department_id = d.department_id
   AND d.department_id IN (60, 90);

DEPARTMENT_ID LAST_NAME                     SALARY      DRANK
------------- ------------------------- ---------- ----------
           60 Lorentz                         4200          1
           60 Austin                          4800          2
           60 Pataballa                       4800          2
           60 Ernst                           6000          3
           60 Hunold                          9000          4
           90 Kochhar                        17000          1
           90 De Haan                        17000          1
           90 King                           24000          2

FIRST
Description: From the set of rows ranked by DENSE_RANK, takes the rows that rank first (there may be several, because values can tie). The full syntax therefore needs an aggregate function in front to pick a value out of those rows.
SAMPLE: In the example below, DENSE_RANK partitions by department and orders by commission_pct. FIRST takes all rows with the lowest commission, and the MIN function in front picks the lowest salary from that set. LAST takes all rows with the highest commission, and the MAX function in front picks the highest salary from that set.

SELECT last_name, department_id, salary,
       MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY commission_pct)
         OVER (PARTITION BY department_id) "Worst",
       MAX(salary) KEEP (DENSE_RANK LAST ORDER BY commission_pct)
         OVER (PARTITION BY department_id) "Best"
  FROM employees
 WHERE department_id IN (20, 80)
 ORDER BY department_id, salary;

LAST_NAME                 DEPARTMENT_ID     SALARY      Worst       Best
------------------------- ------------- ---------- ---------- ----------
Fay                                  20       6000       6000      13000
Hartstein                            20      13000       6000      13000
Kumar                                80       6100       6100      14000
Banda                                80       6200       6100      14000
Johnson                              80       6200       6100      14000
Ande                                 80       6400       6100      14000
Lee                                  80       6800       6100      14000
Tuvault                              80       7000       6100      14000
Sewall                               80       7000       6100      14000
Marvins                              80       7200       6100      14000
Bates                                80       7300       6100      14000
...

FIRST_VALUE
Description: Returns the first value in the data window of a group.
SAMPLE: The example below returns the name that corresponds to the first value in the data window, partitioned by department and ordered by salary. If several rows share the first salary value, the first name in the default ordering among them is returned.

SELECT department_id, last_name, salary,
       FIRST_VALUE(last_name)
         OVER (PARTITION BY department_id ORDER BY salary ASC) AS lowest_sal
  FROM employees
 WHERE department_id IN (20, 30);

DEPARTMENT_ID LAST_NAME                     SALARY LOWEST_SAL
------------- ------------------------- ---------- --------------
           20 Fay                             6000 Fay
           20 Hartstein                      13000 Fay
           30 Colmenares                      2500 Colmenares
           30 Himuro                          2600 Colmenares
           30 Tobias                          2800 Colmenares
           30 Baida                           2900 Colmenares
           30 Khoo                            3100 Colmenares
           30 Raphaely                       11000 Colmenares

LAG
Description: Gives access to other rows of the result set without a self join. It lets you treat the cursor as if it were an array. Within a given group you can refer to rows before the current row, and so select earlier rows alongside the current one. Offset is a positive integer that defaults to 1. If the offset falls outside the window, the default value is returned (null unless you specify one). The opposite function is LEAD.
SAMPLE: In the example below, the column prev_sal returns the salary of the previous row in hire_date order.

SELECT last_name, hire_date, salary,
       LAG(salary, 1, 0) OVER (ORDER BY hire_date) AS prev_sal
  FROM employees
 WHERE job_id = 'PU_CLERK';

LAST_NAME                 HIRE_DATE      SALARY   PREV_SAL
------------------------- ---------- ---------- ----------
Khoo                      18-MAY-95        3100          0
Tobias                    24-JUL-97        2800       3100
Baida                     24-DEC-97        2900       2800
Himuro                    15-NOV-98        2600       2900
Colmenares                10-AUG-99        2500       2600

LAST
Description: From the set of rows ranked by DENSE_RANK, takes the rows that rank last (there may be several, because values can tie). The full syntax therefore needs an aggregate function in front to pick a value out of those rows.
SAMPLE: In the example below, DENSE_RANK partitions by department and orders by commission_pct. FIRST takes all rows with the lowest commission, and the MIN function in front picks the lowest salary from that set. LAST takes all rows with the highest commission, and the MAX function in front picks the highest salary from that set.

SELECT last_name, department_id, salary,
       MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY commission_pct)
         OVER (PARTITION BY department_id) "Worst",
       MAX(salary) KEEP (DENSE_RANK LAST ORDER BY commission_pct)
         OVER (PARTITION BY department_id) "Best"
  FROM employees
 WHERE department_id IN (20, 80)
 ORDER BY department_id, salary;

LAST_NAME                 DEPARTMENT_ID     SALARY      Worst       Best
------------------------- ------------- ---------- ---------- ----------
Fay                                  20       6000       6000      13000
Hartstein                            20      13000       6000      13000
Kumar                                80       6100       6100      14000
Banda                                80       6200       6100      14000
Johnson                              80       6200       6100      14000
Ande                                 80       6400       6100      14000
Lee                                  80       6800       6100      14000
Tuvault                              80       7000       6100      14000
Sewall                               80       7000       6100      14000
Marvins                              80       7200       6100      14000
Bates                                80       7300       6100      14000
...

LAST_VALUE
Description: Returns the last value in the data window of a group.
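What LAST_VALUE returns depends heavily on where the data window ends. When ORDER BY is given without an explicit windowing clause, the default window stops at the current row (plus any rows that tie with it), so the "last" value is usually the current row itself. The Python sketch below imitates both cases for the department 30 rows shown in the FIRST_VALUE example. It is an illustration of the window semantics only, and the function names are made up.

```python
# Department 30 rows (last_name, salary), already sorted by salary.
rows = [("Colmenares", 2500), ("Himuro", 2600), ("Tobias", 2800),
        ("Baida", 2900), ("Khoo", 3100), ("Raphaely", 11000)]

def last_value_default_frame(rows):
    """Window = first row up to the current row, plus peers with the same salary."""
    out = []
    for _, salary in rows:
        frame = [name for name, s in rows if s <= salary]
        out.append(frame[-1])
    return out

def last_value_full_frame(rows):
    """Window = whole partition (UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING)."""
    return [rows[-1][0]] * len(rows)

print(last_value_default_frame(rows))
# ['Colmenares', 'Himuro', 'Tobias', 'Baida', 'Khoo', 'Raphaely']
print(last_value_full_frame(rows))
# ['Raphaely', 'Raphaely', 'Raphaely', 'Raphaely', 'Raphaely', 'Raphaely']
```

To get the name of the highest-paid employee on every row, the window would have to be extended explicitly, for example with ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.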
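SAMPLE: The example below returns the name that corresponds to the last value in the data window, partitioned by department and ordered by salary. If several rows share the last salary value, the last name in the default ordering among them is returned. Because the query gives ORDER BY without a windowing clause, the window ends at the current row, so each row shows its own name.

SELECT department_id, last_name, salary,
       LAST_VALUE(last_name)
         OVER (PARTITION BY department_id ORDER BY salary) AS highest_sal
  FROM employees
 WHERE department_id IN (20, 30);

DEPARTMENT_ID LAST_NAME                     SALARY HIGHEST_SAL
------------- ------------------------- ---------- ------------
           20 Fay                             6000 Fay
           20 Hartstein                      13000 Hartstein
           30 Colmenares                      2500 Colmenares
           30 Himuro                          2600 Himuro
           30 Tobias                          2800 Tobias
           30 Baida                           2900 Baida
           30 Khoo                            3100 Khoo
           30 Raphaely                       11000 Raphaely

LEAD
Description: LEAD is the opposite of LAG. It gives access to rows after the current row in the group. Offset is a positive integer that defaults to 1. If the offset falls outside the window, the default value is returned (null unless you specify one).
SAMPLE: In the example below, "NextHired" returns the hire_date of the next row in hire_date order.

SELECT last_name, hire_date,
       LEAD(hire_date, 1) OVER (ORDER BY hire_date) AS "NextHired"
  FROM employees
 WHERE department_id = 30;

LAST_NAME                 HIRE_DATE NextHired
------------------------- --------- ---------
Raphaely                  07-DEC-94 18-MAY-95
Khoo                      18-MAY-95 24-JUL-97
Tobias                    24-JUL-97 24-DEC-97
Baida                     24-DEC-97 15-NOV-98
Himuro                    15-NOV-98 10-AUG-99
Colmenares                10-AUG-99

MAX
Description: Finds the maximum value of an expression in the data window of a group.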
SAMPLE: In the example below, dept_max returns the highest salary in the current row's department.

SELECT department_id, last_name, salary,
       MAX(salary) OVER (PARTITION BY department_id) AS dept_max
  FROM employees
 WHERE department_id IN (10, 20, 30);

DEPARTMENT_ID LAST_NAME                     SALARY   DEPT_MAX
------------- ------------------------- ---------- ----------
           10 Whalen                          4400       4400
           20 Hartstein                      13000      13000
           20 Fay                             6000      13000
           30 Raphaely                       11000      11000
           30 Khoo                            3100      11000
           30 Baida                           2900      11000
           30 Tobias                          2800      11000
           30 Himuro                          2600      11000
           30 Colmenares                      2500      11000

MIN
Description: Finds the minimum value of an expression in the data window of a group.
SAMPLE: In the example below, dept_min returns the lowest salary in the current row's department.

SELECT department_id, last_name, salary,
       MIN(salary) OVER (PARTITION BY department_id) AS dept_min
  FROM employees
 WHERE department_id IN (10, 20, 30);

DEPARTMENT_ID LAST_NAME                     SALARY   DEPT_MIN
------------- ------------------------- ---------- ----------
           10 Whalen                          4400       4400
           20 Hartstein                      13000       6000
           20 Fay                             6000       6000
           30 Raphaely                       11000       2500
           30 Khoo                            3100       2500
           30 Baida                           2900       2500
           30 Tobias                          2800       2500
           30 Himuro                          2600       2500
           30 Colmenares                      2500       2500

NTILE
Description: Divides an ordered group into the number of buckets given by the expression. For example, if the expression is 4, each row in the group is assigned a number from 1 to 4. If the group has 20 rows, the first 5 rows are assigned 1, the next 5 rows are assigned 2, and so on.
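The bucket assignment can be sketched in a few lines of Python. This is an illustration rather than Oracle's code, and the function names are hypothetical. It assumes the documented behaviour that bucket sizes differ by at most one row and that the lowest-numbered buckets receive the extra rows.

```python
def ntile_sizes(row_count, buckets):
    """Number of rows in each bucket, lowest-numbered bucket first."""
    base, extra = divmod(row_count, buckets)
    # The first `extra` buckets each take one additional row.
    return [base + 1 if i < extra else base for i in range(buckets)]

def ntile(row_count, buckets):
    """Bucket number assigned to each row, in sort order."""
    labels = []
    for bucket, size in enumerate(ntile_sizes(row_count, buckets), start=1):
        labels.extend([bucket] * size)
    return labels

print(ntile_sizes(20, 4))  # [5, 5, 5, 5]
print(ntile_sizes(21, 4))  # [6, 5, 5, 5]
print(ntile(6, 4))         # [1, 1, 2, 2, 3, 4]
```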
If the cardinality of the group is not evenly divisible by the expression, the rows are distributed so that no percentile has more than one row more than any other percentile, and the lowest percentiles are the ones that receive the extra rows. For example, with expression = 4 and 21 rows, percentile = 1 has 6 rows, percentile = 2 has 5 rows, and so on.
SAMPLE: The example below divides 6 rows into 4 buckets.

SELECT last_name, salary,
       NTILE(4) OVER (ORDER BY salary DESC) AS quartile
  FROM employees
 WHERE department_id = 100;

LAST_NAME                     SALARY   QUARTILE
------------------------- ---------- ----------
Greenberg                      12000          1
Faviet                          9000          1
Chen                            8200          2
Urman                           7800          2
Sciarra                         7700          3
Popp                            6900          4

PERCENT_RANK
Description: Similar to the CUME_DIST (cumulative distribution) function. For a given row in a group, the function takes the rank of that row, subtracts 1, and divides by n-1, where n is the number of rows in the group. The function always returns a number between 0 and 1 inclusive.
SAMPLE: In the example below, if Khoo's salary were 2900 (the same as Baida's), his pr value would be 0.6, because the RANK function returns the same rank for equal values.

SELECT department_id, last_name, salary,
       PERCENT_RANK() OVER (PARTITION BY department_id ORDER BY salary) AS pr
  FROM employees
 WHERE department_id < 50
 ORDER BY department_id, salary;

DEPARTMENT_ID LAST_NAME                     SALARY         PR
------------- ------------------------- ---------- ----------
           10 Whalen                          4400          0
           20 Fay                             6000          0
           20 Hartstein                      13000          1
           30 Colmenares                      2500          0
           30 Himuro                          2600        0.2
           30 Tobias                          2800        0.4
           30 Baida                           2900        0.6
           30 Khoo                            3100        0.8
           30 Raphaely                       11000          1
           40 Mavris                          6500          0

PERCENTILE_CONT
Description: Returns the data value that corresponds to the given percentile. The percentile is computed as described for PERCENT_RANK. If no data value corresponds exactly, the result is obtained with the following algorithm:

    RN  = 1 + (P * (N - 1))    where P is the given percentile and N is the number of rows in the group
    CRN = CEIL(RN)
    FRN = FLOOR(RN)
    if (CRN = FRN = RN) then
        (value of expression from row at RN)
    else
        (CRN - RN) * (value of expression for row at FRN) +
        (RN - FRN) * (value of expression for row at CRN)

Note: This function differs from PERCENTILE_DISC in how the substitute value is computed when no matching distribution value is found.
SAMPLE: In the example below, the Percentile_Cont value for department 60 is computed as follows:

    P   = 0.7
    N   = 5
    RN  = 1 + (P * (N - 1)) = 1 + (0.7 * (5 - 1)) = 3.8
    CRN = CEIL(3.8) = 4
    FRN = FLOOR(3.8) = 3
    (4 - 3.8) * 4800 + (3.8 - 3) * 6000 = 5760

SELECT last_name, salary, department_id,
       PERCENTILE_CONT(0.7) WITHIN GROUP (ORDER BY salary)
         OVER (PARTITION BY department_id) "Percentile_Cont",
       PERCENT_RANK()
         OVER (PARTITION BY department_id ORDER BY salary) "Percent_Rank"
  FROM employees
 WHERE department_id IN (30, 60);

LAST_NAME                     SALARY DEPARTMENT_ID Percentile_Cont Percent_Rank
------------------------- ---------- ------------- --------------- ------------
Colmenares                      2500            30            3000            0
Himuro                          2600            30            3000          0.2
Tobias                          2800            30            3000          0.4
Baida                           2900            30            3000          0.6
Khoo                            3100            30            3000          0.8
Raphaely                       11000            30            3000            1
Lorentz                         4200            60            5760            0
Austin                          4800            60            5760         0.25
Pataballa                       4800            60            5760         0.25
Ernst                           6000            60            5760         0.75
Hunold                          9000            60            5760            1

PERCENTILE_DISC
Description: Returns the data value that corresponds to the given percentile. The percentile is computed as described for CUME_DIST. If no data value corresponds exactly, the value at the next higher distribution value is taken.
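The two percentile rules can be compared side by side with a short Python sketch. It follows the RN/CRN/FRN algorithm given for PERCENTILE_CONT and the "next higher CUME_DIST value" rule given for PERCENTILE_DISC, applied to the salaries of departments 30 and 60 listed above. The function names are made up, and this is a re-implementation for checking the arithmetic, not Oracle's code.

```python
import math

def percentile_cont(p, sorted_values):
    """Linear interpolation between the rows at FLOOR(RN) and CEIL(RN)."""
    n = len(sorted_values)
    rn = 1 + p * (n - 1)
    frn, crn = math.floor(rn), math.ceil(rn)
    if frn == crn:
        return sorted_values[frn - 1]  # RN lands exactly on a row
    return ((crn - rn) * sorted_values[frn - 1]
            + (rn - frn) * sorted_values[crn - 1])

def percentile_disc(p, sorted_values):
    """First value whose cumulative distribution is at least p."""
    n = len(sorted_values)
    for v in sorted_values:
        cume_dist = sum(1 for x in sorted_values if x <= v) / n
        if cume_dist >= p:
            return v

dept30 = [2500, 2600, 2800, 2900, 3100, 11000]
dept60 = [4200, 4800, 4800, 6000, 9000]

print(round(percentile_cont(0.7, dept30), 6), round(percentile_cont(0.7, dept60), 6))
# 3000.0 5760.0
print(percentile_disc(0.7, dept30), percentile_disc(0.7, dept60))
# 3100 6000
```

PERCENTILE_CONT can return a value that does not occur in the data (3000 for department 30), whereas PERCENTILE_DISC always returns one of the existing values.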
Note: This function differs from PERCENTILE_CONT in how the substitute value is computed when no matching distribution value is found.
SAMPLE: In the example below, the distribution value 0.7 has no matching Cume_Dist value in department 30, so the SALARY that corresponds to the next distribution value, 0.83333333, is used instead.

SELECT last_name, salary, department_id,
       PERCENTILE_DISC(0.7) WITHIN GROUP (ORDER BY salary)
         OVER (PARTITION BY department_id) "Percentile_Disc",
       CUME_DIST()
         OVER (PARTITION BY department_id ORDER BY salary) "Cume_Dist"
  FROM employees
 WHERE department_id IN (30, 60);

LAST_NAME                     SALARY DEPARTMENT_ID Percentile_Disc  Cume_Dist
------------------------- ---------- ------------- --------------- ----------
Colmenares                      2500            30            3100 .166666667
Himuro                          2600            30            3100 .333333333
Tobias                          2800            30            3100         .5
Baida                           2900            30            3100 .666666667
Khoo                            3100            30            3100 .833333333
Raphaely                       11000            30            3100          1
Lorentz                         4200            60            6000         .2
Austin                          4800            60            6000         .6
Pataballa                       4800            60            6000         .6
Ernst                           6000            60            6000         .8
Hunold                          9000            60            6000          1

RANK
Description: Computes the position of each row returned by the query relative to the other rows, based on the values of the expressions in the ORDER BY clause.
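The rows in a group are sorted by the ORDER BY clause, and each row is then assigned a number, forming a sequence that starts at 1 and counts upward. The sequence increases each time the value of the ORDER BY expression changes. Rows with the same value receive the same number (nulls are considered equal to each other). However, when two rows do receive the same rank, the sequence then skips ahead. If two rows are ranked 1, there is no rank 2, and the next row in the group is assigned 3. DENSE_RANK never skips.
SAMPLE: The example below computes the sequence number of each employee, partitioned by department and ordered by salary (note the difference from the DENSE_RANK function).

SELECT d.department_id, e.last_name, e.salary,
       RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary) AS drank
  FROM employees e, departments d
 WHERE e.department_id = d.department_id
   AND d.department_id IN (60, 90);

DEPARTMENT_ID LAST_NAME                     SALARY      DRANK
------------- ------------------------- ---------- ----------
           60 Lorentz                         4200          1
           60 Austin                          4800          2
           60 Pataballa                       4800          2
           60 Ernst                           6000          4
           60 Hunold                          9000          5
           90 Kochhar                        17000          1
           90 De Haan                        17000          1
           90 King                           24000          3

RATIO_TO_REPORT
Description: Computes the value of expression / SUM(expression). It gives the ratio of the current row to the total, that is, the contribution of the current row to SUM(expression).
SAMPLE: The example below computes each employee's salary as a fraction of the total salary for employees with that job.

SELECT last_name, salary,
       RATIO_TO_REPORT(salary) OVER () AS rr
  FROM employees
 WHERE job_id = 'PU_CLERK';

LAST_NAME                     SALARY         RR
------------------------- ---------- ----------
Khoo                            3100 .223021583
Baida                           2900 .208633094
Tobias                          2800 .201438849
Himuro                          2600  .18705036
Colmenares                      2500 .179856115

REGR_ (Linear Regression) Functions
Description: The linear regression functions fit an ordinary least-squares regression line to a set of number pairs. Nine different regression functions are available.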
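As a rough illustration of what a least-squares fit computes, the Python sketch below derives the slope and intercept from the population covariance and variance, with the dependent variable as the first argument and the independent variable as the second. The helper names and the data points are hypothetical, chosen so that the points lie exactly on the line y = 2x + 1.

```python
def mean(values):
    return sum(values) / len(values)

def covar_pop(ys, xs):
    """Population covariance of the (y, x) pairs."""
    my, mx = mean(ys), mean(xs)
    return sum((y - my) * (x - mx) for y, x in zip(ys, xs)) / len(xs)

def var_pop(xs):
    """Population variance, i.e. the covariance of xs with itself."""
    return covar_pop(xs, xs)

def regr_slope(ys, xs):
    """Slope of the fitted line: COVAR_POP(y, x) / VAR_POP(x)."""
    return covar_pop(ys, xs) / var_pop(xs)

def regr_intercept(ys, xs):
    """y-intercept of the fitted line: AVG(y) - slope * AVG(x)."""
    return mean(ys) - regr_slope(ys, xs) * mean(xs)

# Hypothetical pairs lying on y = 2x + 1.
quantity = [1, 2, 3, 4]
amount = [3, 5, 7, 9]
print(regr_slope(amount, quantity), regr_intercept(amount, quantity))  # 2.0 1.0
```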
REGR_SLOPE: Returns the slope of the line, equal to COVAR_POP(expr1, expr2) / VAR_POP(expr2)
REGR_INTERCEPT: Returns the y-intercept of the regression line, equal to AVG(expr1) - REGR_SLOPE(expr1, expr2) * AVG(expr2)
REGR_COUNT: Returns the number of non-null number pairs used to fit the regression line
REGR_R2: Returns the coefficient of determination of the regression line, computed as:
    If VAR_POP(expr2) = 0 then return NULL
    If VAR_POP(expr1) = 0 and VAR_POP(expr2) != 0 then return 1
    If VAR_POP(expr1) > 0 and VAR_POP(expr2) != 0 then return POWER(CORR(expr1, expr2), 2)
REGR_AVGX: Computes the average of the independent variable (expr2) of the regression line after null pairs (expr1, expr2) are removed, equal to AVG(expr2)
REGR_AVGY: Computes the average of the dependent variable (expr1) of the regression line after null pairs (expr1, expr2) are removed, equal to AVG(expr1)
REGR_SXX: Returns REGR_COUNT(expr1, expr2) * VAR_POP(expr2)
REGR_SYY: Returns REGR_COUNT(expr1, expr2) * VAR_POP(expr1)
REGR_SXY: Returns REGR_COUNT(expr1, expr2) * COVAR_POP(expr1, expr2)

(All of the examples below were run as the SH user.)

SAMPLE 1: The example below computes the cumulative slope and intercept of the regression line of amount sold against quantity sold, for weekend sales of two products (260 and 270) during the last three weeks of 1998.

SELECT t.fiscal_month_number "Month", t.day_number_in_month "Day",
       REGR_SLOPE(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_SLOPE,
       REGR_INTERCEPT(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_ICPT
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND s.prod_id IN (270, 260)
   AND t.fiscal_year = 1998
   AND t.fiscal_week_number IN (50, 51, 52)
   AND t.day_number_in_week IN (6, 7)
 ORDER BY t.fiscal_month_desc, t.day_number_in_month;

     Month        Day  CUM_SLOPE   CUM_ICPT
---------- ---------- ---------- ----------
        12         12        -68       1872
        12         12        -68       1872
        12         13 -20.244898 1254.36735
        12         13 -20.244898 1254.36735
        12         19 -18.826087       1287
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         26 67.2658228 58.9712313
        12         26 67.2658228 58.9712313
        12         27 37.5245541 284.958221
        12         27 37.5245541 284.958221
        12         27 37.5245541 284.958221

SAMPLE 2: The example below computes the cumulative number of transactions for each day of April 1998.

SELECT UNIQUE t.day_number_in_month,
       REGR_COUNT(s.amount_sold, s.quantity_sold)
         OVER (PARTITION BY t.fiscal_month_number
               ORDER BY t.day_number_in_month) "Regr_Count"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND t.fiscal_year = 1998
   AND t.fiscal_month_number = 4;

DAY_NUMBER_IN_MONTH Regr_Count
------------------- ----------
                  1        825
                  2       1650
                  3       2475
                  4       3300
...
                 26      21450
                 30      22200

SAMPLE 3: The example below computes the cumulative coefficient of determination of the regression line of amount sold against quantity sold for each month of 1998.

SELECT t.fiscal_month_number,
       REGR_R2(SUM(s.amount_sold), SUM(s.quantity_sold))
         OVER (ORDER BY t.fiscal_month_number) "Regr_R2"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND t.fiscal_year = 1998
 GROUP BY t.fiscal_month_number
 ORDER BY t.fiscal_month_number;

FISCAL_MONTH_NUMBER    Regr_R2
------------------- ----------
                  1
                  2          1
                  3 .927372984
                  4 .807019972
                  5 .932745567
                  6  .94682861
                  7 .965342011
                  8 .955768075
                  9 .959542618
                 10 .938618575
                 11 .880931415
                 12 .882769189

SAMPLE 4: The example below computes the cumulative averages of amount sold and quantity sold for product 260 during the last two weeks of December 1998.

SELECT t.day_number_in_month,
       REGR_AVGY(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) "Regr_AvgY",
       REGR_AVGX(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) "Regr_AvgX"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND s.prod_id = 260
   AND t.fiscal_month_desc = '1998-12'
   AND t.fiscal_week_number IN (51, 52)
 ORDER BY t.day_number_in_month;

DAY_NUMBER_IN_MONTH  Regr_AvgY  Regr_AvgX
------------------- ---------- ----------
                 14        882       24.5
                 14        882       24.5
                 15        801      22.25
                 15        801      22.25
                 16      777.6       21.6
                 18 642.857143 17.8571429
                 18 642.857143 17.8571429
                 20      589.5     16.375
                 21        544 15.1111111
                 22 592.363636 16.4545455
                 22 592.363636 16.4545455
                 24 553.846154 15.3846154
                 24 553.846154 15.3846154
                 26        522       14.5
                 27      578.4 16.0666667

SAMPLE 5: The example below computes the cumulative REGR_SXY, REGR_SXX and REGR_SYY statistics of amount sold against quantity sold for weekend sales of products 260 and 270 in February 1998.

SELECT t.day_number_in_month,
       REGR_SXY(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxy",
       REGR_SYY(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_syy",
       REGR_SXX(s.amount_sold, s.quantity_sold)
         OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxx"
  FROM sales s, times t
 WHERE s.time_id = t.time_id
   AND prod_id IN (270, 260)
   AND t.fiscal_month_desc = '1998-02'
   AND t.day_number_in_week IN (6, 7)
 ORDER BY t.day_number_in_month;

DAY_NUMBER_IN_MONTH   Regr_sxy   Regr_syy   Regr_sxx
------------------- ---------- ---------- ----------
                  1    18870.4  2116198.4      258.4
                  1    18870.4  2116198.4      258.4
                  1    18870.4  2116198.4      258.4
                  1    18870.4  2116198.4      258.4
                  7    18870.4  2116198.4      258.4
                  8    18870.4  2116198.4      258.4
                 14    18870.4  2116198.4      258.4
                 15    18870.4  2116198.4      258.4
                 21    18870.4  2116198.4      258.4
                 22    18870.4  2116198.4      258.4

ROW_NUMBER
Description: Returns the offset of a row within an ordered group, which can be used as a row number under a specific sort order.
SAMPLE: The example below returns the sequence number of each employee within the department, ordered by employee ID.

SELECT department_id, last_name, employee_id,
       ROW_NUMBER() OVER (PARTITION BY department_id ORDER BY employee_id) AS emp_id
  FROM employees
 WHERE department_id < 50;

DEPARTMENT_ID LAST_NAME                 EMPLOYEE_ID     EMP_ID
------------- ------------------------- ----------- ----------
           10 Whalen                            200          1
           20 Hartstein                         201          1
           20 Fay                               202          2
           30 Raphaely                          114          1
           30 Khoo                              115          2
           30 Baida                             116          3
           30 Tobias                            117          4
           30 Himuro                            118          5
           30 Colmenares                        119          6
           40 Mavris                            203          1

STDDEV
Description: Computes the standard deviation of the expression over the group. (Standard Deviation)
SAMPLE: The example below returns the cumulative standard deviation of salaries in department 30, ordered by hire date.

SELECT last_name, hire_date, salary,
       STDDEV(salary) OVER (ORDER BY hire_date) "StdDev"
  FROM employees
 WHERE department_id = 30;

LAST_NAME                 HIRE_DATE      SALARY     StdDev
------------------------- ---------- ---------- ----------
Raphaely                  07-DEC-94       11000          0
Khoo                      18-MAY-95        3100 5586.14357
Tobias                    24-JUL-97        2800  4650.0896
Baida                     24-DEC-97        2900 4035.26125
Himuro                    15-NOV-98        2600  3649.2465
Colmenares                10-AUG-99        2500 3362.58829

STDDEV_POP
Description: Computes the population standard deviation and returns the square root of the population variance. Its return value is the same as the square root of the VAR_POP function.
(Standard Deviation, Population)
SAMPLE: The example below returns the population standard deviation of salaries in departments 20, 30 and 60.

SELECT department_id, last_name, salary,
       STDDEV_POP(salary) OVER (PARTITION BY department_id) AS pop_std
  FROM employees
 WHERE department_id IN (20, 30, 60);

DEPARTMENT_ID LAST_NAME                     SALARY    POP_STD
------------- ------------------------- ---------- ----------
           20 Hartstein                      13000       3500
           20 Fay                             6000       3500
           30 Raphaely                       11000  3069.6091
           30 Khoo                            3100  3069.6091
           30 Baida                           2900  3069.6091
           30 Colmenares                      2500  3069.6091
           30 Himuro                          2600  3069.6091
           30 Tobias                          2800  3069.6091
           60 Hunold                          9000 1722.32401
           60 Ernst                           6000 1722.32401
           60 Austin                          4800 1722.32401
           60 Pataballa                       4800 1722.32401
           60 Lorentz                         4200 1722.32401

STDDEV_SAMP
Description: Computes the cumulative sample standard deviation and returns the square root of the sample variance. Its return value is the same as the square root of the VAR_SAMP function.
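The difference between the population and the sample standard deviation is only the divisor (n versus n-1), which is easy to check with Python's standard statistics module. The sketch below uses the department 20 and department 60 salaries from the table above. It is a cross-check of the arithmetic, not a description of how Oracle computes the values.

```python
import statistics

dept20 = [13000, 6000]
dept60 = [9000, 6000, 4800, 4800, 4200]

# Population standard deviation: divide the sum of squared deviations by n.
pop20 = statistics.pstdev(dept20)
pop60 = statistics.pstdev(dept60)

# Sample standard deviation: divide by n - 1 instead.
samp20 = statistics.stdev(dept20)

print(round(pop20, 5), round(pop60, 5), round(samp20, 5))
# 3500.0 1722.32401 4949.74747
```

With a single row the sample form has a zero divisor. Python's statistics.stdev raises an error in that case, while the STDDEV_SAMP output leaves the first row of each department empty.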
(Standard Deviation, Sample)
SAMPLE: The example below returns the cumulative sample standard deviation of salaries in departments 20, 30 and 60, ordered by hire date within each department.

SELECT department_id, last_name, hire_date, salary,
       STDDEV_SAMP(salary)
         OVER (PARTITION BY department_id ORDER BY hire_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_sdev
  FROM employees
 WHERE department_id IN (20, 30, 60);

DEPARTMENT_ID LAST_NAME                 HIRE_DATE      SALARY   CUM_SDEV
------------- ------------------------- ---------- ---------- ----------
           20 Hartstein                 17-FEB-96       13000
           20 Fay                       17-AUG-97        6000 4949.74747
           30 Raphaely                  07-DEC-94       11000
           30 Khoo                      18-MAY-95        3100 5586.14357
           30 Tobias                    24-JUL-97        2800  4650.0896
           30 Baida                     24-DEC-97        2900 4035.26125
           30 Himuro                    15-NOV-98        2600  3649.2465
           30 Colmenares                10-AUG-99        2500 3362.58829
           60 Hunold                    03-JAN-90        9000
           60 Ernst                     21-MAY-91        6000 2121.32034
           60 Austin                    25-JUN-97        4800 2163.33077
           60 Pataballa                 05-FEB-98        4800 1982.42276
           60 Lorentz                   07-FEB-99        4200 1925.61678

SUM
Description: Computes the cumulative sum of an expression within a group.
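How a cumulative sum treats ties depends on whether the window is defined by value (RANGE) or by position (ROWS). The Python sketch below contrasts the two on an illustrative list of sorted salaries that contains a tie. The function names are made up, and the code only imitates the window semantics.

```python
def running_sum_range(sorted_values):
    """RANGE UNBOUNDED PRECEDING: each row sums every row whose value is
    less than or equal to its own, so tied rows get the same total."""
    return [sum(v for v in sorted_values if v <= current)
            for current in sorted_values]

def running_sum_rows(sorted_values):
    """ROWS UNBOUNDED PRECEDING: each row sums the rows up to its own position."""
    total, out = 0, []
    for v in sorted_values:
        total += v
        out.append(total)
    return out

salaries = [4400, 6500, 10000, 12000, 12000]  # illustrative values with a tie
print(running_sum_range(salaries))  # [4400, 10900, 20900, 44900, 44900]
print(running_sum_rows(salaries))   # [4400, 10900, 20900, 32900, 44900]
```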
SAMPLE: The example below computes the cumulative salary of the employees who report to the same manager.

SELECT manager_id, last_name, salary,
       SUM(salary) OVER (PARTITION BY manager_id ORDER BY salary
                         RANGE UNBOUNDED PRECEDING) l_csum
  FROM employees
 WHERE manager_id IN (101, 103, 108);

MANAGER_ID LAST_NAME                     SALARY     L_CSUM
---------- ------------------------- ---------- ----------
       101 Whalen                          4400       4400
       101 Mavris                          6500      10900
       101 Baer                           10000      20900
       101 Greenberg                      12000      44900
       101 Higgins                        12000      44900
       103 Lorentz                         4200       4200
       103 Austin                          4800      13800
       103 Pataballa                       4800      13800
       103 Ernst                           6000      19800
       108 Popp                            6900       6900
       108 Sciarra                         7700      14600
       108 Urman                           7800      22400
       108 Chen                            8200      30600
       108 Faviet                          9000      39600

VAR_POP
Description: (Variance, Population) Returns the population variance of a set of non-null values (nulls are ignored). VAR_POP is computed as:
    (SUM(expr*expr) - SUM(expr)*SUM(expr) / COUNT(expr)) / COUNT(expr)
SAMPLE: The example below computes the cumulative population and sample variance of monthly sales in 1998 (run as the SH user).

SELECT t.calendar_month_desc,
       VAR_POP(SUM(s.amount_sold))
         OVER (ORDER BY t.calendar_month_desc) "Var_Pop",
       VAR_SAMP(SUM(s.amount_sold))
         OVER (ORDER BY t.calendar_month_desc) "Var_Samp"
  FROM sales s, times t
 WHERE s.time_id = t.time_id AND t.calendar_year = 1998
 GROUP BY t.calendar_month_desc;

CALENDAR    Var_Pop   Var_Samp
-------- ---------- ----------
1998-01           0
1998-02  6.1321E+11 1.2264E+12
1998-03  4.7058E+11 7.0587E+11
1998-04  4.6929E+11 6.2572E+11
1998-05  1.5524E+12 1.9405E+12
1998-06  2.3711E+12 2.8453E+12
1998-07  3.7464E+12 4.3708E+12
1998-08  3.7852E+12 4.3260E+12
1998-09  3.5753E+12 4.0222E+12
1998-10  3.4343E+12 3.8159E+12
1998-11  3.4245E+12 3.7669E+12
1998-12  4.8937E+12 5.3386E+12

VAR_SAMP
Description: (Variance, Sample) Returns the sample variance of a set of non-null values (nulls are ignored). VAR_SAMP is computed as:
    (SUM(expr*expr) - SUM(expr)*SUM(expr) / COUNT(expr)) / (COUNT(expr) - 1)
SAMPLE: The example below computes the cumulative population and sample variance of monthly sales in 1998.

SELECT t.calendar_month_desc,
       VAR_POP(SUM(s.amount_sold))
         OVER (ORDER BY t.calendar_month_desc) "Var_Pop",
       VAR_SAMP(SUM(s.amount_sold))
         OVER (ORDER BY t.calendar_month_desc) "Var_Samp"
  FROM sales s, times t
 WHERE s.time_id = t.time_id AND t.calendar_year = 1998
 GROUP BY t.calendar_month_desc;

CALENDAR    Var_Pop   Var_Samp
-------- ---------- ----------
1998-01           0
1998-02  6.1321E+11 1.2264E+12
1998-03  4.7058E+11 7.0587E+11
1998-04  4.6929E+11 6.2572E+11
1998-05  1.5524E+12 1.9405E+12
1998-06  2.3711E+12 2.8453E+12
1998-07  3.7464E+12 4.3708E+12
1998-08  3.7852E+12 4.3260E+12
1998-09  3.5753E+12 4.0222E+12
1998-10  3.4343E+12 3.8159E+12
1998-11  3.4245E+12 3.7669E+12
1998-12  4.8937E+12 5.3386E+12

VARIANCE
Description: Returns the variance of an expression. Oracle computes it as follows:
    If the number of rows in the expression is 1, it returns 0.
    If the number of rows in the expression is greater than 1, it returns VAR_SAMP.
SAMPLE: The example below returns the cumulative variance of salaries in department 30, ordered by hire date.

SELECT last_name, salary,
       VARIANCE(salary) OVER (ORDER BY hire_date) "Variance"
  FROM employees
 WHERE department_id = 30;

LAST_NAME                     SALARY   Variance
------------------------- ---------- ----------
Raphaely                       11000          0
Khoo                            3100   31205000
Tobias                          2800 21623333.3
Baida                           2900 16283333.3
Himuro                          2600   13317000
Colmenares                      2500   11307000
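The cumulative figures in the VARIANCE output can be reproduced with a few lines of Python. The sketch below applies the rule stated above (0 for a single row, sample variance otherwise) to the department 30 salaries in hire-date order. The function name is hypothetical, and the standard statistics module stands in for VAR_SAMP.

```python
import statistics

def cumulative_variance(values):
    """Variance of the first 1, 2, ..., n values, returning 0 for a single row."""
    out = []
    for i in range(1, len(values) + 1):
        window = values[:i]
        out.append(0.0 if len(window) == 1 else statistics.variance(window))
    return out

# Department 30 salaries in hire_date order, taken from the table above.
salaries = [11000, 3100, 2800, 2900, 2600, 2500]
print([round(float(v), 1) for v in cumulative_variance(salaries)])
# [0.0, 31205000.0, 21623333.3, 16283333.3, 13317000.0, 11307000.0]
```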