In PL/SQL, I want to calculate the Shapiro-Wilk value of my selected data.
Apparently, the function DBMS_STAT_FUNCS.normal_dist_fit is able to do it, but it takes a table name as a parameter, and not the result of a select.
The values I am working with are numbers, selected after many inner joins. Here is a sample of my values for a typical test (each value are only one number, I only regrouped them here for readability) :
-1,1168954372406006 -1,0339429378509521 -1,0194162130355835 -,9636680483818054 -,9536418914794922 -,9111286401748657 -,851311981678009 -,8325300216674805 -,8051750063896179 -,738100528717041 -,7174761295318604 -,6651638746261597 -,663613498210907 -,6444216966629028 -,6267942190170288 -,6180349588394165 -,6027824282646179 -,5999149084091187 -,5908389687538147 -,590206503868103 -,5845686793327332 -,5831132531166077 -,5629676580429077 -,5572993159294128 -,5481508374214172 -,5445670485496521 -,5093156099319458 -,49224603176116943 -,47992199659347534 -,4758097231388092 -,4546264410018921 -,38799363374710083 -,3803306519985199 -,3734436631202698 -,36905646324157715 -,27978914976119995 -,2048187553882599 -,16521787643432617
Advertisement
Answer
You could create a view based on your query. (“View” just means a named query, whose code is saved for future use; a view does not contain any data. So even if you would rather not create tables on a production server, it may be OK to create views.) Then you can use this view in a call to the goodness-of-fit procedure.
First let’s clarify that DBMS_STAT_FUNCS.normal_dist_fit
is not a function, as might be suggested by the FUNCS in the package name (and as you say in your post), but a procedure. In fact, all the subprograms in that package are procedures!
The documentation for the package is wrong.
https://docs.oracle.com/database/121/ARPLS/d_stat_f.htm#ARPLS68476
It says that mean
and stddev
are IN parameters. They are not; they are OUT parameters. It is not really clear why they are needed in this procedure (there are separate functions to compute them), but again, that is not my choice, it’s Oracle’s.
Here is a brief example of using the normal_dist_fit
procedure on a physical table (from a different schema: the standard HR
schema). Note that I must have the select any table
privilege, or some other privilege that allows me to select from tables in the HR
schema. (I am not logged in as HR
, but as another user, MATHGUY
.)
declare mn number; sd number; sw number; begin dbms_stat_funcs.normal_dist_fit('HR', 'EMPLOYEES', 'SALARY', 'SHAPIRO_WILKS', mn, sd, sw); end; / W value : .8739562109117848523112862359519603805821 PL/SQL procedure successfully completed.
(Note that the argument for the Shapiro-Wilk test is ‘SHAPIRO_WILKS’, with an S at the end; perhaps Oracle had a good reason for this, although I doubt it.)
Now, let’s say in my schema I have this query, which produces a column of numbers:
select salary * (1 + nvl(commission_pct, 0)) as total_compensation from hr.employees ;
This computes total compensation, including commission (for those employees who have one; the NVL
call is needed because for employees without a commission, the percentage is shown as NULL rather than zero).
I will create a view based on this query… but I will need to have the needed privileges, and there is just a slight complication.
So far I assumed I can select from a table in the HR
schema; indeed, I have the SELECT ANY TABLE
privilege. But, in my case, I have that privilege through a role. That’s OK for selecting, but not OK for creating a view based on that SELECT
. To create a view, I must have SELECT ANY TABLE
(or a much weaker grant: to select specifically from that one table, or perhaps restricted even further to just the two columns I need) granted to me directly, not through a role. This is something you need to discuss with your DBA; it’s a privilege you already have (otherwise your existing SELECT
wouldn’t work), but if you have it through a role, now you will need to have it granted directly to you. And, of course, you must have the CREATE VIEW
privilege.
OK, so suppose you have the right privileges. Then you can do this:
Create the view
create or replace view my_emp(total_comp) as select salary*(1 + nvl(commission_pct, 0)) from hr.employees;
Invoke the procedure
declare mn number; sd number; sw number; begin dbms_stat_funcs.normal_dist_fit('MATHGUY', 'MY_EMP', 'TOTAL_COMP', 'SHAPIRO_WILKS', mn, sd, sw); end; / W value : .8852586932906502861798487994791857389177