== Notes on Oracle Query Performance - Scalar Subqueries v. Inline Views == The notes below supplement my answer to a Stack Overflow question here: http://stackoverflow.com/questions/4265213/how-to-turn-2-queries-with-common-columns-a-b-and-a-c-into-just-one-a-b/4265246#4265246 In Oracle, a query like mine is **usually** more efficient than PerformanceDBA's solution using scalar subqueries, due to the way Oracle processes scalar subqueries. As a test I set up three tables corresponding to Producers, AnimalsBought and AnimalsExploration and tested both approaches with a reasonable amount of data - 1000 Producers, with 2 million AnimalsBought rows and 2 million AnimalsExploration rows, both evenly spread among the 1000 Producers. My tables are called T1, T2 and T3 with T1 playing the role of Producers. They are defined as follows: -- 1000 rows create table t1 (id integer not null primary key, padding varchar2(200) not null); -- 2000000 rows create table t2 (id integer not null primary key, if_t2 integer not null padding varchar2(200) not null); create index t2_t1_id on t2 (id_t1); -- 2000000 rows create table t3 (id integer not null primary key, if_t2 integer not null padding varchar2(200) not null); create index t3_t1_id on t3 (id_t1); I ran my tests on Oracle Database 10g Enterprise Edition Release 10.2.0.3.0, running on a Windows server (I don't have more specific details about the Windows server, but it is just a small box used for a small development project). Now it turns out (embarrassingly contrary to a comment I made earlier when I had made a mistake in my tests) that PerformanceDBA's query **is** quicker in this particular example where we are simply **counting** child rows: 1) Scalar subquery, performing count: select t1.id, (select count(id) from t2 where t2.id_t1 = t1.id) t2_cnt, (select count(id) from t3 where t3.id_t1 = t1.id) t3_cnt from t1; Timings from 5 runs (seconds elapsed): 3.60 3.50 3.69 3.39 3.43 Execution plan: SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1000 Bytes=3000) SORT (AGGREGATE) INDEX (RANGE SCAN) OF 'T2_T1_ID' (INDEX) (Cost=7 Card=1998 Bytes=5994) SORT (AGGREGATE) INDEX (RANGE SCAN) OF 'T3_T1_ID' (INDEX) (Cost=7 Card=1998 Bytes=5994) INDEX (FAST FULL SCAN) OF 'SYS_C0027014' (INDEX (UNIQUE)) (Cost=2 Card=1000 Bytes=3000) 2) Inline views with outer joins, performing count: select t1.id, v2.cnt, v3.cnt from t1 left outer join (select id_t1, count(id) cnt from t2 group by id_t1) v2 on v2.id_t1 = t1.id left outer join (select id_t1, count(id) cnt from t3 group by id_t1) v3 on v3.id_t1 = t1.id; Timings from 5 runs (seconds elapsed): 5.26 5.30 6.14 5.59 6.10 Execution plan: SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2303 Card=1000 Bytes=55000) HASH JOIN (RIGHT OUTER) (Cost=2303 Card=1000 Bytes=55000) VIEW (Cost=1150 Card=1001 Bytes=26026) HASH (GROUP BY) (Cost=1150 Card=1001 Bytes=3003) INDEX (FAST FULL SCAN) OF 'T3_T1_ID' (INDEX) (Cost=940 Card=2000000 Bytes=6000000) HASH JOIN (OUTER) (Cost=1153 Card=1000 Bytes=29000) INDEX (FAST FULL SCAN) OF 'SYS_C0027014' (INDEX (UNIQUE)) (Cost=2 Card=1000 Bytes=3000) VIEW (Cost=1150 Card=1001 Bytes=26026) HASH (GROUP BY) (Cost=1150 Card=1001 Bytes=3003) INDEX (FAST FULL SCAN) OF 'T2_T1_ID' (INDEX) (Cost=940 Card=2000000 Bytes=6000000) This is because the counts can be performed without visiting the actual tables T2 and T3, just using the indexes I created on T2(ID_T1) and T3(ID_T1). However, if instead of just a count we had wanted some information that required accessing the child tables, then the performance is very different. Both take considerably longer, but the inline views outperform the scalar subqueries dramatically: 1) Scalar subquery, performing SUM: select t1.id, (select sum(id) from t2 where t2.id_t1 = t1.id) t2_cnt, (select sum(id) from t3 where t3.id_t1 = t1.id) t3_cnt from t1; Timings from 1 run: Still running after 2 hours... I will update this when I have the answer. Execution plan: (To be posted soon). 2) Inline views with outer joins, performing SUM: select t1.id, v2.cnt, v3.cnt from t1 left outer join (select id_t1, sum(id) cnt from t2 group by id_t1) v2 on v2.id_t1 = t1.id left outer join (select id_t1, sum(id) cnt from t3 group by id_t1) v3 on v3.id_t1 = t1.id; Timings from 2 runs: 1min 17sec 1min 19sec Execution plan: SELECT STATEMENT Optimizer=ALL_ROWS (Cost=27264 Card=1000 Bytes=55000) HASH JOIN (RIGHT OUTER) (Cost=27264 Card=1000 Bytes=55000) VIEW (Cost=13631 Card=1001 Bytes=26026) HASH (GROUP BY) (Cost=13631 Card=1001 Bytes=8008) TABLE ACCESS (FULL) OF 'T3' (TABLE) (Cost=13421 Card=2000000 Bytes=16000000) HASH JOIN (OUTER) (Cost=13633 Card=1000 Bytes=29000) INDEX (FAST FULL SCAN) OF 'SYS_C0027014' (INDEX (UNIQUE)) (Cost=2 Card=1000 Bytes=3000) VIEW (Cost=13630 Card=1001 Bytes=26026) HASH (GROUP BY) (Cost=13630 Card=1001 Bytes=8008) TABLE ACCESS (FULL) OF 'T2' (TABLE) (Cost=13420 Card=2000000 Bytes=16000000) In this example the cost of computing each scalar subquery is significant and so computing each one 1000 times is very slow. **Conclusion** I must retract my claim that my query is faster than PerformanceDBAs for the particular query that is the subject of the original question. However, I still claim that inline views perform better than scalar subqueries **in Oracle** in many (perhaps most) cases.