Wednesday, December 5, 2012

The need for a better Excel to SQL Server connection...

Just a bit of a rant here...

If ever there were two products from Microsoft made for each other, they are Excel and SQL Server... (well, that is, now that the newer xlsx format is available).

But sadly, these two products don't 'schmooze' as easily as they might.

Oh sure, there are plenty of ways to 'integrate' Excel with SQL Server, but they are all sadly wanting. To prove that point, all you have to do is Google (or Bing, or whatever) something like 'how do I submit parameter queries to SQL Server from Excel' and you'll suddenly see my point here.

All those GUI tools don't do the job well.
Writing .Net or ADO or any of the other alphabet soup of products... well, you can do it, but you really need to know those languages and some 'gotchas' to make them work.

If you've ever developed an SSRS report in BIDS, you know how simple and straightforward it is.
Build a data source.  Set the connection for the data source.  Write the query.  Need input parameters?  Define them in the query object.  Done. Now run the query - boom, you have your answer set.
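That last step is the part Excel makes so hard. In an SSRS dataset, a parameterized query is nothing more than a query with a variable in it; a sketch of what I mean (the table and column names here are made up for illustration, and @StartDate / @EndDate would be defined as report parameters in BIDS):

```sql
-- An SSRS dataset query referencing report parameters directly.
-- dbo.Orders and its columns are hypothetical example names.
SELECT  o.OrderID,
        o.OrderDate,
        o.TotalDue
FROM    dbo.Orders AS o
WHERE   o.OrderDate >= @StartDate
  AND   o.OrderDate <  @EndDate
ORDER BY o.OrderDate;
```

That's it. No plumbing code, no connection-string gymnastics. You define the parameters, SSRS prompts for them, and the query runs.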

Here's the real question ... Why in the world isn't this kind of functionality provided as a means to integrate Excel to SQL Server?

Surely Microsoft can see that this capability ought to exist for Excel.  If they cannot it is beyond me why they cannot... or maybe it isn't beyond me.

Because after all, if SQL developers had such a tool, they wouldn't need SharePoint or all the other middleware tools, would they?

They wouldn't need anything at all except Excel and SQL Server...and maybe that's why it won't ever happen?

Provisioning for the unanticipated...

I'm in the process of setting up an SSAS cube.

I'm early in the process... tweaking and perfecting the ETL for the fact and dimension tables.

I just ran into something that I should have thought of before, but never had encountered up to now (not because I'm brilliant, no, far from it, it is because up to now, I was just ... plain ... lucky).

Here is what happened.  I was feeling pretty smug as the ETL had been running perfectly for about a week when one day I came in and noticed a failed job in SQL Monitor (Red Gate).

I began to investigate why the job failed.  It had failed trying to re-apply a foreign key (FK) creation on a table after the table had been updated.

After digging into why the FK creation wouldn't happen, I discovered it had nothing to do with the syntax or mechanics of the ALTER TABLE statement, but rather with a disconcerting relationship disconnect between a fact table and an associated dimension table.  The fact table had exactly one row more than should have been there.
That extra row had a key value of '0' (put there by a default constraint in the CREATE TABLE statement).  That '0' value told me that for this row, there was no matching value in the dimension table I was trying to relate the fact table to!  In a 'perfect world' that never happens (luck); but in the real world it can and does happen (my luck ran out).
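Finding a row like that is a simple anti-join: show me the fact rows whose key has no partner in the dimension. A sketch of the check, with hypothetical table and column names:

```sql
-- Fact rows with no matching dimension row -- exactly the rows
-- that will make a FOREIGN KEY creation fail.
-- dbo.FactSales, dbo.DimProduct, and ProductKey are example names.
SELECT  f.*
FROM    dbo.FactSales AS f
LEFT JOIN dbo.DimProduct AS d
       ON d.ProductKey = f.ProductKey
WHERE   d.ProductKey IS NULL;
```

If that query returns anything, the ALTER TABLE ... ADD CONSTRAINT will fail, and the error message won't tell you which rows are to blame.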

I have since solved the mystery of why and how that extra row got there, but here's the point/suggestion I'm blogging about today.

When you are inserting values into your dimension tables, it may be a good move to create a row for a condition such as the one I describe above.  My dim tables now all have row #1 with a key of '0'; its other columns hold 'N/A' (text data types) or '0' (number data types).
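The shape of the idea, as a sketch (again, all table and column names here are hypothetical):

```sql
-- Dimension table with an 'unknown member' row as row #1.
CREATE TABLE dbo.DimProduct
(
    ProductKey  int           NOT NULL PRIMARY KEY,
    ProductName varchar(50)   NOT NULL,
    ListPrice   decimal(10,2) NOT NULL
);

-- Key 0: the catch-all row for facts with no real match.
INSERT INTO dbo.DimProduct (ProductKey, ProductName, ListPrice)
VALUES (0, 'N/A', 0);

-- The fact table defaults its key column to 0, so an unmatched
-- fact row lands on the 'N/A' row instead of breaking the FK.
CREATE TABLE dbo.FactSales
(
    SalesID    int NOT NULL PRIMARY KEY,
    ProductKey int NOT NULL
        CONSTRAINT DF_FactSales_ProductKey DEFAULT (0),
    Amount     decimal(10,2) NOT NULL,
    CONSTRAINT FK_FactSales_DimProduct
        FOREIGN KEY (ProductKey) REFERENCES dbo.DimProduct (ProductKey)
);
```

With the '0' row in place, the FK always has something to point at, even when the lookup during the load comes up empty.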

That way loads won't fail, and you are spared a mind-wracking journey through MSFT's error messages about why a key could not be created (ultimately in my case it wasn't about the key's ALTER TABLE statement at all, but about the data in the two tables I was trying to create a relationship for).

You can then also code corrections in the ETL for fact rows that ended up joined to the '0' value in the key column, but have that run AFTER the load job has completed (and this time it won't fail with an error that has you chasing your tail about why an FK could not be created).
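The post-load cleanup step can be as simple as flagging the rows parked on the unknown member so they can be re-keyed once the real dimension values arrive. A sketch, using the same hypothetical names as above:

```sql
-- After the load completes: find the fact rows that fell through
-- to the unknown-member key so they can be investigated/re-keyed.
-- dbo.FactSales and ProductKey are hypothetical example names.
SELECT  f.SalesID,
        f.ProductKey
FROM    dbo.FactSales AS f
WHERE   f.ProductKey = 0;
```

Run it as the last step of the job; the load itself never fails, and you still find out exactly which rows need attention.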