SQL Workshop - Subselects And Join

Written by Nikos Vaggalis

Thursday, 21 November 2013

SQL Workshop is where we confront the type of problems SQL devs face on a day to day basis. The example here is based on a real problem encountered in Ingres - but it has been simplified for the purpose of illustration. What do you do when you can't use a subselect within a Join?

SQLW

Sometimes you know how to do something but the database you are working with just wont co-operate. You can give up or exercise the brain and find a different way of doing the job.

From time to time , there are certain SQL queries that are not supported by the DBMS’s current version, with support being promised TBA in future versions, or in worst case, never planning for it

But we really do need to make it work now, thus we must discover other ways of going about it.

The Problem

As an example, let’s assume that there is a fictional reseller company that uses the tables bought_master to record invoices issued for the products purchased, bought_detail to keep track of their corresponding details and sold_detail to record invoice details issued for the products sold. To keep the example code short we won't use the corresponding sold_master table.

bought_master
year	invoice	status
1/1/2013	1	paid
1/1/2013	2	paid
1/1/2013	3	paid
1/1/2013	4	paid
1/1/2013	5	paid

bought_detail
year	invoice	product_code	price
1/1/2013	1	1001002	$25
1/1/2013	1	23001002	$20
1/1/2013	2	1001002	$35
1/1/2013	3	23001002	$40
1/1/2013	4	23001002	$60

sold_detail
year	invoice	product_code	price
1/1/2013	300	1001002	$10
1/1/2013	301	1001002	$40

For a given product there must be entries in the bought table but not necessarily in sold; for example item 23001002 has 3 entries in the bought tables but none in sold, which in essence means that although the item has been bought, it has not generated a sale yet.

On the contrary, item 1001002 has entries in both tables which means that it has been bought and subsequently sold.

For comparison/statistical reasons we are asked to retrieve the average buying price per product, together with its latest selling price.

The desired result set should look like this:

desired result set
year	product code	latest price sold	average price bought
1/1/2013	1001002	$40	$30
1/1/2013	23001002	NULL	$60

How do we reach to those results ?

For product 1001002, bought_detail has two invoices: 1 with price $25 and 2 with price bought $35, so the average is (25+35)/2 =$30, while its latest price sold retrieved from invoice 301 of the corresponding entry in the sold_detail, is $40.

Product 23001002 has invoice entries 1,3,4 with average value ($20+$40+$60)/3 = $60, while there is no entry in the sold table since the product has not generated a sale yet.

Our first attempt at tackling it would be :

Select d.product_code as "product code" s.price as "latest price sold" avg(d.price) as "average price bought"from bought_detail d inner join bought_master h on d.year = h.year and d.invoice = h.invoice left join sold_detail s on d.product_code = s.product_code and d.last_year = s.last_year and s.invoice >= ALL (select s1.invoice from sold_detail s1 where s.year = s1.last_year and s.product_code = s1.product_code)where h.status="paid" and h.year='1/1/2013'group by d.product_code, s.price

Which is translated into English as :

For each item that has been bought and its invoice has been paid, get its average bought price and, where applicable, its latest sold price. Thus for each item we need all rows from the bought tables that have a status of paid, plus the row from the sold table that keeps the latest sold price. However, we also want to keep all entries from the bought table that do not have a corresponding entry in the sold table, therefore we'll use use a left join to satisfy this requirement.

To get to the latest price sold we use the ALL operator and a correlated sub-query which will do a one by one comparison of each product’s row in the sold table with the rest of the rows of the same product in that same table (that is, doing a self join on the sold table), to get the price based on the latest invoice number ( which is produced in a sequential fashion). Alternatively the same would be achievable with the use of the max operator.

However, running the query (on Ingres v9.2) fails with:

E_PS03A0 Subselects are not supported inside join qualifications.

The same stands still true, when replacing ALL with the max operator:

left join sold_detail s on d.product_code = s.product_code and d.last_year = s.last_year and s.invoice = (select max(s1.invoice) from invoice_sold_detail s1 where s.year = s1.last_year and s.product_code = s1.product_code)

So it seems that we've hit a technical issue.

Another Way

But there should be a way of bypassing this limitation, and there is, as long as we restructure the query into :

Select d.product_code as "product code" s.price as "latest price sold" avg(d.price) as "average price bought"from bought_detail d inner join bought_master h on d.year = h.year and d.invoice = h.invoice left join sold_detail s on d.product_code = s.product_code and d.year=s.yearwhere h.status="paid" and h.year='1/1/2013' and (s.invoice = (select max(s1.invoice) from sold_detail s1 where s.product_code = s1.product_code and s1.year = s1.year ) or s.invoice IS NULL)group by d.product_code, s.price

What happens is that we shift the aggregate test into the WHERE filtering clause, and add the or s.invoice IS NULL condition which takes advantage of the NULL values returned by the left join whenever there is no applicable invoice entry in the sold table, thus keeping all entries of the bought table together with the latest sold price wherever found

This goes to show that there are times that we must resort into makeshift solutions to overcome technical limitations imposed by our DBMS, something deemed feasible by SQL’s expressiveness, staying loyal to the “there's more than one way to do it” principle.

SQLW

Other SQL Workshops

SQL Workshop - Selecting columns without including a non-aggregate column in the group by clause

SQL Workshop - Subselects And Join

More SQL Workshop soon!

SQL Server: Quickly get row counts for tables, heaps, indexes, and partitions

The First Things I Look At On A SQL Server – Part 1

The First Things I Look At On A SQL Server – Part 2

A Generic SQL Server Compression Utility

Codd and his Rules

To be informed about new articles on I Programmer, install the I Programmer Toolbar, subscribe to the RSS feed, follow us on, Twitter, Facebook, Google+ or Linkedin, or sign up for our weekly newsletter.

Comments

or email your comment to: comments@i-programmer.info

Parasoft Adds AI Assistant To C/C++ Test
30/06/2025

Parasoft has updated its C/C++ Test software with an AI-powered documentation assistant, along with complete support for MISRA C:2025 and auto-suppression of equivalent violations. C/C++ Test can be u [ ... ]

+ Full Story

SQL and Assembly Language In Decline In TIOBE Index
09/06/2025

This month's TIOBE Index is out and the headline asks the question "Where is SQL Going?" Its chart provides the answer of "down", a fate that also applied to Assembly Language. The language that is mo [ ... ]

+ Full Story

More News

Last Updated ( Thursday, 19 December 2013 )

Recent Articles

Recent Book Reviews

Popular Articles

The Problem

bought_master

bought_detail

sold_detail

desired result set

Another Way

Other SQL Workshops

Related Articles

Comments