Some of the methods I have created have been to UNION multiple tables holding the data for different groups, building derived tables, doing the equivalent with CTE, and probably many others. I've tried using stored procedures as well, but invariably each request is an ad-hoc one, and the stored procedure approach led nowhere because of it.
Then came 'CUBE'. I have settled off on using this because it very easily and readily creates the information I need. (Now if Microsoft would only improve the PIVOT operation so that multiple aggregates could be handled... but I digress).
Here is an example of how to use a CUBE operation to simplify the summation of sales data for a report of customers' sales YTD.
First we need to build a customer table....
-- create some customers
if exists(select * from sysobjects where type = 'U' and name = 'customer')
drop table customer
;
SELECT '28549'[billing #], '28549'[customer #], cast('ABC Corporation' as nvarchar(255))[name], cast('123 Anywhere Place' as nvarchar(255))[Address],
cast('Irmo' as nvarchar(255))[City], 'SC'[St]
into customer;
insert into customer SELECT '32568'[billing #], '32568'[customer #], 'ABC Corporation'[name], '123 Anywhere Place'[Address], 'Irmo'[City], 'SC'[St];
insert into customer SELECT '46598'[billing #], '46598'[customer #], 'Hannibal Cartons'[name], '1235 Courtney Blvd.'[Address], 'Walterboro'[City], 'SC'[St];
insert into customer SELECT '84956'[billing #], '84956'[customer #], 'Quattle Container'[name], '3857 Cedar Road'[Address], 'Lexington'[City], 'SC'[St];
insert into customer SELECT '51324'[billing #], '51324'[customer #], 'Simmons Freightlines'[name], '456 Anywhere Place'[Address], 'Irmo'[City], 'SC'[St];
select * from customer
;
| billing # | customer # | name | Address | City | St | 
| 28549 | 28549 | ABC Corporation | 123 Anywhere Place | Irmo | SC | 
| 32568 | 32568 | ABC Corporation | 123 Anywhere Place | Irmo | SC | 
| 46598 | 46598 | Hannibal Cartons | 1235 Courtney Blvd. | Walterboro | SC | 
| 84956 | 84956 | Quattle Container | 3857 Cedar Road | Lexington | SC | 
| 51324 | 51324 | Simmons Freightlines | 456 Anywhere Place | Irmo | SC | 
Next we need to build some invoice data. Being the pragmatic sort, since I don't need to provide breakdowns by items sold, I don't need the line data, so I'll be using data from the invoice header table only.
-- create some invoice headers
if exists(select * from sysobjects where type = 'U' and name = 'invoicehdr')
drop table invoicehdr
;
select '46598'[billing #], '46598'[customer #], cast('05-11-2012' as datetime)[invoice-date], 1256.55[gross], cast(1256.55*.05 as decimal(10,2))[tax] into invoicehdr
insert into invoicehdr select '28549'[billing #], '28549'[customer #], cast('06-30-2012' as datetime)[invoice-date], 125.36[gross], cast(125.36*.05 as decimal(10,2))[tax]
insert into invoicehdr select '28549'[billing #], '28549'[customer #], cast('07-12-2012' as datetime)[invoice-date], 326.10[gross], cast(326.10*.05 as decimal(10,2))[tax]
insert into invoicehdr select '28549'[billing #], '28549'[customer #], cast('07-28-2012' as datetime)[invoice-date], 42.10[gross], cast(42.10*.05 as decimal(10,2))[tax]
insert into invoicehdr select '32568'[billing #], '32568'[customer #], cast('01-21-2012' as datetime)[invoice-date], 442.30[gross], cast(442.30*.05 as decimal(10,2))[tax]
insert into invoicehdr select '32568'[billing #], '32568'[customer #], cast('03-19-2012' as datetime)[invoice-date], 442.30[gross], cast(442.30*.05 as decimal(10,2))[tax]
insert into invoicehdr select '32568'[billing #], '32568'[customer #], cast('04-19-2012' as datetime)[invoice-date], 442.30[gross], cast(442.30*.05 as decimal(10,2))[tax]
insert into invoicehdr select '32568'[billing #], '32568'[customer #], cast('05-19-2012' as datetime)[invoice-date], 442.30[gross], cast(442.30*.05 as decimal(10,2))[tax]
insert into invoicehdr select '32568'[billing #], '32568'[customer #], cast('06-19-2012' as datetime)[invoice-date], 442.30[gross], cast(442.30*.05 as decimal(10,2))[tax]
insert into invoicehdr select '32568'[billing #], '32568'[customer #], cast('07-19-2012' as datetime)[invoice-date], 442.30[gross], cast(442.30*.05 as decimal(10,2))[tax]
insert into invoicehdr select '46598'[billing #], '46598'[customer #], cast('01-19-2012' as datetime)[invoice-date], 546.30[gross], cast(546.30*.05 as decimal(10,2))[tax]
insert into invoicehdr select '46598'[billing #], '46598'[customer #], cast('03-10-2012' as datetime)[invoice-date], 342.20[gross], cast(342.20*.05 as decimal(10,2))[tax]
insert into invoicehdr select '51324'[billing #], '51324'[customer #], cast('03-14-2012' as datetime)[invoice-date], 156.89[gross], cast(156.89*.05 as decimal(10,2))[tax]
insert into invoicehdr select '51324'[billing #], '51324'[customer #], cast('04-23-2012' as datetime)[invoice-date], 423.65[gross], cast(423.65*.05 as decimal(10,2))[tax]
select * from invoicehdr
;
| billing # | customer # | invoice-date | gross | tax | 
| 46598 | 46598 | 5/11/2012 | 1256.55 | 62.83 | 
| 28549 | 28549 | 6/30/2012 | 125.36 | 6.27 | 
| 28549 | 28549 | 7/12/2012 | 326.1 | 16.31 | 
| 28549 | 28549 | 7/28/2012 | 42.1 | 2.11 | 
| 32568 | 32568 | 1/21/2012 | 442.3 | 22.12 | 
| 32568 | 32568 | 3/19/2012 | 442.3 | 22.12 | 
| 32568 | 32568 | 4/19/2012 | 442.3 | 22.12 | 
| 32568 | 32568 | 5/19/2012 | 442.3 | 22.12 | 
| 32568 | 32568 | 6/19/2012 | 442.3 | 22.12 | 
| 32568 | 32568 | 7/19/2012 | 442.3 | 22.12 | 
| 46598 | 46598 | 1/19/2012 | 546.3 | 27.32 | 
| 46598 | 46598 | 3/10/2012 | 342.2 | 17.11 | 
| 51324 | 51324 | 3/14/2012 | 156.89 | 7.84 | 
| 51324 | 51324 | 4/23/2012 | 423.65 | 21.18 | 
Now we have a customer table and we have some invoice data for those customers; we are ready to create a sales ytd report.
Using the CUBE operation, we can easily and rapidly get our summary data as shown below:
select
case when grouping(i.[billing #])=1 then 'All' else [billing #] end [billing #],
case when grouping(i.[customer #])=1 then 'All' else [customer #] end [customer #],
sum(i.gross)gross,
sum(i.tax)tax
from invoicehdr as i with (nolock)
group by i.[billing #], i.[customer #]
with cube
order by [billing #], [customer #]
;
To use the 'with cube' operation, you need to use the GROUPING() statement on the columns you want to group by. The GROUPING statement will return a value of 1 for the group and a 0 for each element in the group. However, I like to case my GROUPING() statements to return the word, 'All' for the total and then the data for each element in the group. It just makes more sense to me. You can use whatever you choose, that's one of the points of being pragmatic.
The query returns the following answer set:
| billing # | customer # | gross | tax | 
| 28549 | 28549 | 493.56 | 24.69 | 
| 28549 | All | 493.56 | 24.69 | 
| 32568 | 32568 | 2653.8 | 132.72 | 
| 32568 | All | 2653.8 | 132.72 | 
| 46598 | 46598 | 2145.05 | 107.26 | 
| 46598 | All | 2145.05 | 107.26 | 
| 51324 | 51324 | 580.54 | 29.02 | 
| 51324 | All | 580.54 | 29.02 | 
| All | 28549 | 493.56 | 24.69 | 
| All | 32568 | 2653.8 | 132.72 | 
| All | 46598 | 2145.05 | 107.26 | 
| All | 51324 | 580.54 | 29.02 | 
| All | All | 5872.95 | 293.69 | 
Here is how to filter them out easily... A good YTD report is going to show more data about the customer than their billing and customer numbers, so we want to perform a join of the CUBE query answer set and the customer table as shown below:
select cb.[billing #], cb.[customer #], c.[name], c.[Address], c.City, c.St, cb.gross, cb.tax
from
(
select
case when grouping(i.[billing #])=1 then 'All' else [billing #] end [billing #],
case when grouping(i.[customer #])=1 then 'All' else [customer #] end [customer #],
sum(i.gross)gross,
sum(i.tax)tax
from invoicehdr as i with (nolock)
group by i.[billing #], i.[customer #]
with cube
) cb
inner join customer as c with (nolock) on cb.[billing #]=c.[billing #] and cb.[customer #]=c.[customer #]
order by gross desc
;
Since I know there is no billing # or customer # data with the value "All" the inner join is going to filter out the group and subgroup total rows out of my cube. Since the billing and customer numbers are identical in this data set, and in the interest of having space on my post page for a tidy report, I'll eliminate the billing # from the answer set:
| customer # | name | Address | City | St | gross | tax | |
| 32568 | ABC Corporation | 123 Anywhere Place | Irmo | SC | $ 2,653.80 | $ 132.72 | |
| 46598 | Hannibal Cartons | 1235 Courtney Blvd. | Walterboro | SC | $ 2,145.05 | $ 107.26 | |
| 51324 | Simmons Freightlines | 456 Anywhere Place | Irmo | SC | $ 580.54 | $ 29.02 | |
| 28549 | ABC Corporation | 123 Anywhere Place | Irmo | SC | $ 493.56 | $ 24.69 | 
Since no billing # matches the 'All' value in the cube data set, and likewise for the customer #, as I wrote above, these rows are eliminated from the report. All we have are the total sales year-to-date for the customers in our customer table.
This is so much simpler than any other method I've ever used. If you don't know about the CUBE and ROLLUP clauses you need to check them out. They're tremendous time savers (and head ache savers)!!
 
No comments:
Post a Comment