I just make some queries for select data from my server. The query is:
SELECT ROUND((SUM(clength)/1048576),2) AS logical_MB, ROUND((SUM(plength) /1048576),2) AS physical_compr_MB, ds_doc.archiveno, ds_arch.archiveid FROM ECR.ds_comp, ECR.ds_doc, ECR.ds_arch WHERE ds_comp.docidno=ds_doc.docidno AND ds_doc.archiveno =ds_arch.archiveno GROUP BY ds_doc.archiveno, ds_arch.archiveid; result what is expecting is : 9708,24 9704,93 9 Vee3 0,009255342 13140,55 12682,93 10 Vf5 0,012095385 104533,94 89183,02 3 Mdf4 0,085051556 72346,34 48290,63 7 Sds2 0,046053534
But this query almost take one day. Any idea for optimize this query please?
Advertisement
Answer
You provide close to no information that is required to help with performance problem, so only a general checklist can be provided
Check the Query
The query does not qualify the columns clength
and plength
so please check if they are defined in the table ds_comp
– if not, maybe you do not need to join to this table at all…
Also I assume that docidno
is a primary key of ds_doc
and archiveno
is PK of ds_arch
. If not you query will work, but you will get a different result as you expect due to duplication caused by the join (this may also cause excesive elapsed time)!
Verify the Execution Plan
Produce the execution plan for your query in text form (to be able to post it) as follows
EXPLAIN PLAN SET STATEMENT_ID = '<sometag>' into plan_table FOR ... your query here ... SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', '<sometag>','ALL'));
Remember that you are joining complete tables (not only few rows for some ID
), so if you see INDEX ACCESS
or NESTED LOOP
there is a problem that explains the long runtime.
You want to see only HASH JOIN
and FULL TABLE SCAN
in your plan.
Index Access
Contrary to some recommendations in other answers if you want to profit from Index definition you do not need indexes on join columns (as explained above). What you can do is to cover all required attributes in indexes and perform the query using only indexes and ommit the table access at all. This will help if the tables are bright, i.e. the row size is large.
This definition will be needed
create index ds_comp_idx1 on ds_comp (docidno,clength,plength); create index ds_doc_idx1 on ds_doc (docidno,archiveno); create index ds_arch_idx1 on ds_arch (archiveno,archiveid);
and you will receive this plan
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1119K| 97M| 908 (11)| 00:00:01 | | 1 | HASH GROUP BY | | 1119K| 97M| 908 (11)| 00:00:01 | |* 2 | HASH JOIN | | 1119K| 97M| 831 (3)| 00:00:01 | |* 3 | HASH JOIN | | 1001 | 52052 | 5 (0)| 00:00:01 | | 4 | INDEX FULL SCAN | DS_ARCH_IDX1 | 11 | 286 | 1 (0)| 00:00:01 | | 5 | INDEX FAST FULL SCAN| DS_DOC_IDX1 | 1001 | 26026 | 4 (0)| 00:00:01 | | 6 | INDEX FAST FULL SCAN | DS_COMP_IDX1 | 1119K| 41M| 818 (2)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("C"."DOCIDNO"="D"."DOCIDNO") 3 - access("D"."ARCHIVENO"="A"."ARCHIVENO")
Note the INDEX FULL SCAN
and INDEX FAST FULL SCAN
which means you are scanning the data from the index only and you do not need to perform the full table scan.
Use Parallel Option
With your rather simple query there is not much option to improve something. What works always is to deploy a parallel query using the /*+ PARALLEL(N) */
hint.
The precontition is that your database is configured for this option and you have hardware that can deploy it.