Friday, August 23, 2019

SQL Brainf**k, not for faint hearted!

This is a challenge. I'll start with some background of the problem, challenge query and the explanation.

If you know already know the answer - good! If not and want to explain it yourself, don't browse far down.

Have fun!

Ok, let's go.

I have to admit, at first I thought it's a bug in SQL Server. I was doing some data dictionary work in SQL Server and came up with simple query to check database log file size:
select size
from sys.master_files 
where type=1 
and database_id=(select database_id from sys.databases where name='master') 
and name='mastlog'
It'll give you one row (as long as you provide consistent conditions, the example should work everywhere).


but I made a mistake, a typo... (not possible to make with Oracle :P) I left out from statement from the subquery resulting in:
select size
from sys.master_files 
where type=1 
and database_id=(select database_id where name='master') 
and name='mastlog'
and oh boy, how surprised it was for me not to see any rows! I checked conditions - both database and file names were valid and consistent! I should have gotten the single value! Then I realised my mistake and immidiately thought - how stupid SQL Server is... again! That would have never happened in Oracle since it's enforcing the use of the from statement. I would've used the table name there. So, I thought, another good topic for the blog :D So I started building demo query that I could run also in Oracle... I've almost lost concience when it worked exactly the same!

So the challenge is - explain why it works as it does and if it's ok or not.

The query is built using with syntax. There are two datasets defined in with section.
One is a dataset (a combination of two columns, 9 rows total):
select * from
( 
 select 'A' as k union all 
 select 'B' as k union all 
 select 'C' as k
) k
cross join
( 
 select 1 as v union all 
 select 2 as v union all 
 select 3 as v
) v
K          V
- ----------
A          1
A          2
A          3
B          1
B          2
B          3
C          1
C          2
C          3


The other one is a list to be used for filtering the above mentioned dataset:
select 'A' as k, 0 as v union all 
select 'B' as k, 1 as v union all 
select 'C' as k, 0 as v
K          V
- ----------
A          0
B          1
C          0


We want to filter the dataset on column k with rows from list where v equals 1, so our working query is:
-- SQL Server:
with 
 dataset as (
  select * from
  (
   select 'A' as k union all 
   select 'B' as k union all 
   select 'C' as k) k
  cross join
  (
   select 1 as v union all 
   select 2 as v union all 
   select 3 as v) v
 ),
 list as (
  select 'A' as k, 0 as v union all 
  select 'B' as k, 1 as v union all 
  select 'C' as k, 0 as v
 )
select * 
from dataset 
where k = (select k from list where v=1 )
go

-- Oracle
with 
 dataset as (
  select * from
  (
            select 'A' as k from dual union all 
            select 'B' as k from dual union all 
            select 'C' as k from dual) k
  cross join
  (
            select 1 as v from dual union all 
            select 2 as v from dual union all 
            select 3 as v from dual) v
 ),
 list as (
  select 'A' as k, 0 as v from dual union all 
        select 'B' as k, 1 as v from dual union all 
        select 'C' as k, 0 as v from dual
 )
select * 
from dataset 
where k = (select k from list where v=1 );

...and, as expected, we're getting this as an output:
K          V
- ----------
B          1
B          2
B          3


The question is, why the crippled query (same as above but with removed from list part in SQL Server or replaced list with dual in case of Oracle) returns this:
-- SQL Server:
with 
 (...)
select * 
from dataset 
where k = (select k where v=1 )
go

-- Oracle
with 
 (...)
select * 
from dataset 
where k = (select k from dual where v=1 );
K          V
- ----------
A          1
B          1
C          1


Please also note, that the subquery, when executed independently, fails!

? ? ? ? ? ? ? ? ? ?





Best


No comments:

Post a Comment

Non fractional division (int/float)

For me a RDBMS is a high level tool. Something very opposite to using low level languages like C, C++ or... Assembler. From user/developer p...