SQL Workshop – Having Clause With NULLs
Written by Nikos Vaggalis   
Thursday, 05 November 2015

SQL Workshop is where we confront real world scenarios that SQL devs face on a day to day basis, especially when having to maintain legacy systems. This one looks at the problem of handling partially delivered orders.

For this example imagine that you are in charge of an order processing system for a large corperation

Let’s begin with an Order with two items:

 

Order No

Item No

Item Code

Item 

Price per Unit

Quantity ordered

Quantity received

886

1

98001005

Unleaded

1.3624

845.1

711.440

886

2

98003002

Diesel

1.1082

257.8

99.380

 

Each order item can be partially delivered, as shown by the difference between quantity_ordered and quantity_received.

In this case, an order has been placed for 845.1 LT of Unleaded and so far 711.440 LT of it has been received. A second order has been placed for 257.8LT of Diesel and 99.380 LT of it has been received.

 

Also, each Item of each Order can be individually as well as partially delivered. In this case we’ve ordered two items but only item 98001005 has been delivered ,(98003002 has not yet been received at all) although partially, with Delivery numbers 2483 to 2490, with all of them summed up making up the 711.440 LT so far received:

 

2existingdiaqeshs

 

Each delivery has a corresponding delivery confirmation:

 

Delivery 2483:

3exisitingsample

 

 

Delivery 2484:

4exisitingsample

 

Delivery 2485:

5exisitingsample

 

Here is the underlying data model which maps this relation:

erd

Identifying relationship, 1:m, mandatory on Delivery’s end, optional on Order’s end

 

The equivalent result set would be:

Query1

sql

 

 

If we add all entries of quantity_delivered we get a sum of 637.390, therefore there is 711.440 - 637.390= 74.05 LT outstanding.

 

We need the user application to keep track of those sums and be aware of the outstanding quantities. Furthermore we want to present the supplier with the outstanding quantity so that he can either do a single Delivery covering the whole amount or do as many partial deliveries as required to equal that total

 

There are a few ways to go about it from within the user (UI) application:

  1. Have a field somewhere that keeps the total amount received and update upon each new delivery

  2. Retrieve the Delivery rows, loop over them procedurally and sum up the quantities

  3. Call embedded SQL or stored procedure code

We will of course go with the final option and formulate the following SQL:

 

 

 

select d.order_no,

            d.item_code,

            d.quantity_ordered - sum(y.quantity_received)  as     quantity_remaining

from Order d inner join Delivery y on

           y.order_no = d.order_no and

           y.item_code = d.item_code

group by d.order_no,d.item_code, d.quantity_ordered
having (d.quantity_ordered>sum(y.quantity_received))

/*we  also have to include d.quantity_ordered in the GROUP BY
clause because it appears in the SELECT clause and the
Ingres DBMS will not allow otherwise */

 

  

Upon execution we are confronted with an error:

E_US0B68 line 1, Illegal expression in the HAVING clause

which means that we can only use aggregate functions inside the having clause, therefore d.quantity_ordered must be used as an argument of a function of that kind.

 

To find the source of the error means looking back to Query 1.

We see that quantity_ordered is being repeated for each Delivery row, so doing a sum as:

having (sum(d.quantity_ordered)> sum(y.quantity_received))

 would erroneously produce :

(711.440 x 8 > sum(y.quantity_ ordered))

 

What about max ?

having (max(d.quantity_ ordered)> sum(y.quantity_received))

 will always produce:

having (711.440 > sum(y.quantity_ ordered))

 or fully interpolated :

having (711.440 > 637.390)

 which makes max the sole candidate.

 

Rewriting the SQL as:

 

 

select d.order_no,

            d.item_code,

            d.quantity_ordered - sum(y.quantity_received) as quantity_remaining

 from Order d inner join Delivery y on

            y.order_no = d.order_no and

            y.item_code = d.item_code

group by d.order_no,d.item_code, d.quantity_ordered
having (max(d.quantity_ordered)>sum(y.quantity_received));


 

results in:

qrem

 

There is one slight complication, however:

The optionality in the data model allows for there to be an Order without any Delivery yet. This means there could be rows in the Order table for a given order without any related rows in the Delivery table, i.e. sum(y.quantity_received) is NULL. So to play along we must rewrite the query to use a left join and check for NULLS within the having clause:

 

select d.order_no,

            d.item_code,

            ifnull(d.quantity_ordered - sum(y.quantity_received),0) as quantity_remaining

from Order d left join Delivery y on

           y.order_no = d.order_no and

           y.item_code = d.item_code

group by d.order_no,d.item_code, d.quantity_ordered
having (

      max(d.quantity_ordered)>sum(y.quantity_received)
         or
      sum(y.quantity_received) IS NULL

            )

 

Astute readers will also have noticed another way in which the data model is problematic :

quantity_ordered is repeated for every row of the Delivery table as demonstrated by Query1)

something that could have been avoided if a better design was in place.

Nevertheless, inherited cases like this that cannot be rewritten but only patched, offer an excellent testbed for sharpening your problem solving abilities and expanding your SQL skills.

 

SQLW

Other SQL Workshops

SQL Workshop - Removing Duplicate Rows

SQL Workshop - Selecting columns without including a non-aggregate column in the group by clause

SQL Workshop - Subselects And Join 

 

 

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on, Twitter, FacebookGoogle+ or Linkedin.

 

 

 

espbook

 

Comments




or email your comment to: comments@i-programmer.info

Last Updated ( Thursday, 05 November 2015 )