Menu Bar

Friday, November 18, 2011

Flattening Your Data - How to Make Multiple Rows into Columns


"Mercy is the mark of a great man ... <stab> ...Guess I'm just a good one ... <stab> ... Well, I'm alright" - Captain Malcolm Reynolds, Firefly - 'Shindig'.

I recent had to extract some data from my database. The general format of the tables was like so:

Supplier Table:

supplier_code     char(6)
supplier_name     char(40)
supplier address  char(40)
sup_status        char(1)
etc


Colloquial Name Table:

supplier_code     char(6)
colloquial_name   char(40)


Email Address Table:

supplier_code     char(6)
sup_email_address char(40)

The base requirement was to output one supplier per line with the base information on each supplier from the first table and from 0 through 6 colloquial names for the supplier, depending on how many were present. There was also an optional email address. Finally, there was a status code on the table and I had to include every supplier with a status of "3" or less.

I'll be the first to admit that I do not fully understand the depth of the OVER / PARTITION BY syntax but I get the gist of it. I ended up needing to get some help on this one but the end query looked like this:

select a.supplier_code, a.supplier_name, a.supplier_address, 
       d.sup_email_addr, g.colloq_1, g.colloq_2, g.colloq_3, 
       g.colloq_4, g.colloq_5, g.colloq_6
FROM SUPPLIER a
LEFT OUTER JOIN SUP_EMAIL D on a.supplier_code = d.supplier_code
LEFT OUTER JOIN 
(
select a.supplier_code, 
 MAX(CASE WHEN ClqOrder = 1 THEN Colloquial_name ELSE ' ' END) AS colloq_1,
 MAX(CASE WHEN ClqOrder = 2 THEN Colloquial_name ELSE ' ' END) AS colloq_2,
 MAX(CASE WHEN ClqOrder = 3 THEN Colloquial_name ELSE ' ' END) AS colloq_3,
 MAX(CASE WHEN ClqOrder = 4 THEN Colloquial_name ELSE ' ' END) AS colloq_4,
 MAX(CASE WHEN ClqOrder = 5 THEN Colloquial_name ELSE ' ' END) AS colloq_5,
 MAX(CASE WHEN ClqOrder = 6 THEN Colloquial_name ELSE ' ' END) AS colloq_6
FROM (
SELECT s.*, B.Colloquial_name, B.ClqOrder
FROM SUPPLIER S
INNER JOIN
(SELECT supplier_code, Colloquial_Name, 
        ROW_NUMBER() OVER (PARTITION BY supplier_code ORDER BY Colloquial_name) AS ClqOrder
   FROM sup_colloq) B
on S.SUPPLIER_CODE = B.SUPPLIER_CODE) A
GROUP by a.Supplier_CODE
) G on a.supplier_CODE = g.supplier_CODE
where a.sup_status <= '3'

So this all works, but how does it break down ?

For simplicity's sake, and the limited width of the blog page, we're going to shorten all those char(40) columns to something much shorter. Let's start with making the tables and populating some base data.

SQL> create table supplier
  2  (supplier_code char(2),
  3   supplier_name char(5),
  4   supplier_address char(5),
  5   sup_status char(1));

Table created.

SQL> create table sup_colloq
  2  (supplier_code char(2),
  3   colloquial_name char(5));

Table created.

SQL> create table sup_email
  2  (supplier_code char(2),
  3   sup_email_addr char(5));

Table created.

SQL> insert into supplier values ('01','Sup#1','Addr1','1');

1 row created.

SQL> insert into supplier values ('02','Sup#2','Addr2','1');

1 row created.

SQL> insert into supplier values ('03','Sup#3','Addr3','1');

1 row created.

SQL> insert into sup_email values ('01','a@a');

1 row created.

SQL> insert into sup_email values ('02','b@b');

1 row created.

SQL> insert into sup_colloq values('01','Cql01');

1 row created.

SQL> insert into sup_colloq values('01','Cql02');

1 row created.

SQL> insert into sup_colloq values('01','Cql03');

1 row created.

SQL> insert into sup_colloq values('01','Cql04');

1 row created.

SQL> insert into sup_colloq values('01','Cql05');

1 row created.

SQL> insert into sup_colloq values('01','Cql06');

1 row created.

SQL> insert into sup_colloq values('03','Cql07');

1 row created.

SQL>  insert into sup_colloq values('03','Cql08');

1 row created.

Running the big query gives us the expected output:

SQL> select a.supplier_code, a.supplier_name, a.supplier_address, 
  2         d.sup_email_addr, g.colloq_1, g.colloq_2, g.colloq_3, 
  3         g.colloq_4, g.colloq_5, g.colloq_6
  4  FROM SUPPLIER a
  5  LEFT OUTER JOIN SUP_EMAIL D on a.supplier_code = d.supplier_code
  6  LEFT OUTER JOIN 
  7  (
  8  select a.supplier_code, 
  9   MAX(CASE WHEN ClqOrder = 1 THEN Colloquial_name ELSE ' ' END) AS colloq_1,
 10   MAX(CASE WHEN ClqOrder = 2 THEN Colloquial_name ELSE ' ' END) AS colloq_2,
 11   MAX(CASE WHEN ClqOrder = 3 THEN Colloquial_name ELSE ' ' END) AS colloq_3,
 12   MAX(CASE WHEN ClqOrder = 4 THEN Colloquial_name ELSE ' ' END) AS colloq_4,
 13   MAX(CASE WHEN ClqOrder = 5 THEN Colloquial_name ELSE ' ' END) AS colloq_5,
 14   MAX(CASE WHEN ClqOrder = 6 THEN Colloquial_name ELSE ' ' END) AS colloq_6
 15  FROM (
 16  SELECT s.*, B.Colloquial_name, B.ClqOrder
 17  FROM SUPPLIER S
 18  INNER JOIN
 19  (SELECT supplier_code, Colloquial_Name, 
 20          ROW_NUMBER() OVER (PARTITION BY supplier_code ORDER BY Colloquial_name) AS ClqOrder
 21     FROM sup_colloq) B
 22  on S.SUPPLIER_CODE = B.SUPPLIER_CODE) A
 23  GROUP by a.Supplier_CODE
 24  ) G on a.supplier_CODE = g.supplier_CODE
 25  where a.sup_status <= '3'
 26  
SQL> /

SU SUPPL SUPPL SUP_E COLLO COLLO COLLO COLLO COLLO COLLO
-- ----- ----- ----- ----- ----- ----- ----- ----- -----
01 Sup#1 Addr1 a@a   Cql01 Cql02 Cql03 Cql04 Cql05 Cql06
03 Sup#3 Addr3       Cql07 Cql08
02 Sup#2 Addr2 b@b

So now we see what it did, let's break it down. The first small select gets our colloquial names with row numbers in an inline table. These row numbers will help us later.

SQL> SELECT supplier_code, Colloquial_Name, 
  2          ROW_NUMBER() OVER (PARTITION BY supplier_code ORDER BY Colloquial_name) AS ClqOrder
  3     FROM sup_colloq;

SU COLLO   CLQORDER
-- ----- ----------
01 Cql01          1
01 Cql02          2
01 Cql03          3
01 Cql04          4
01 Cql05          5
01 Cql06          6
03 Cql07          1
03 Cql08          2

8 rows selected.

When we grab the next chunk of code, with the CASE / MAX, we can query on each of the ClqOrder columns and turn that into something a little more friendly to what we need:

SQL> select a.supplier_code, 
  2   MAX(CASE WHEN ClqOrder = 1 THEN Colloquial_name ELSE ' ' END) AS colloq_1,
  3   MAX(CASE WHEN ClqOrder = 2 THEN Colloquial_name ELSE ' ' END) AS colloq_2,
  4   MAX(CASE WHEN ClqOrder = 3 THEN Colloquial_name ELSE ' ' END) AS colloq_3,
  5   MAX(CASE WHEN ClqOrder = 4 THEN Colloquial_name ELSE ' ' END) AS colloq_4,
  6   MAX(CASE WHEN ClqOrder = 5 THEN Colloquial_name ELSE ' ' END) AS colloq_5,
  7   MAX(CASE WHEN ClqOrder = 6 THEN Colloquial_name ELSE ' ' END) AS colloq_6
  8  FROM (
  9  SELECT s.*, B.Colloquial_name, B.ClqOrder
 10  FROM SUPPLIER S
 11  INNER JOIN
 12  (SELECT supplier_code, Colloquial_Name, 
 13          ROW_NUMBER() OVER (PARTITION BY supplier_code ORDER BY Colloquial_name) AS ClqOrder
 14     FROM sup_colloq) B
 15  on S.SUPPLIER_CODE = B.SUPPLIER_CODE) A
 16  GROUP by a.Supplier_CODE;

SU COLLO COLLO COLLO COLLO COLLO COLLO
-- ----- ----- ----- ----- ----- -----
01 Cql01 Cql02 Cql03 Cql04 Cql05 Cql06
03 Cql07 Cql08

Finally, adding in the outer joins as show at the top, we pick up the grand combination of everything. This was definitely something I've had to do before and thought it was worth preserving here. It falls squarely int the "for my edification" bucket that was talked about in the first post :)

Monday, November 7, 2011

T-SQL Tuesday #024: Prox 'n' Funx


"It doesn't matter if you win or lose, as long as you look cool doing it." - Julio Scoundrél - Order of the Stick webcomic.
T-SQL TuesdayI only recently became aware of T-SQL Tuesday and considering my last post on Joins, if I had known about it, that post would have made a good submission for last month's T-SQL Tuesday. However, that aside, it's not last month, it's this month.

Brad Schulz (blog) is hosting this month's party and his topic of choice is "Prox 'n' Funx" (Procedures and Functions).

So here we go.

Anyone who spent any time learning anything about just about any kind of programming language (there are exceptions there too) should have learned early on about the usefulness of procedures and functions. The Wikipedia entry on this topic defines it as follows: "a subroutine (also called procedure, function, routine, method, or subprogram) is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code". This definition holds true in SQL Server - however instead of being a part of a larger program, they are stored in the database as executable pieces of code - and these can be called by any other T-SQL program in the database - indeed you can even call them across databases (let's face it, the majority of system procedures are stored in a system database).

With Programming 101 out of the way, I thought I'd write a little about some of the differences between Oracle and SQL Server in Procedure and Function definition for those people (like me), who have to work in both database engines. Since I recently had to port some PL/SQL code to T-SQL, it seems like a good time to write about it.

The differences aren't so much in how the procedure is defined as much as how you call it.

The basic construct for both languages is the same:

CREATE [or replace] PROCEDURE schema.procname(parameters)
AS

The schema part of the declaration is optional. In SQL Server it defaults to the default schema (usually dbo) and in Oracle it's the user name you're logged in as. (If you're not familiar with Oracle's security model you'll have to trust me, that's a post for another time and I won't go into it here). Oracle also allows the "OR REPLACE" syntax to overlay an existing stored procedure - as opposed to SQL Server where you need to drop an object before recreating it (or you can use "alter procedure" to replace the procedure so that permissions on the object are not lost).

Definition of the parameters varies only slightly.

For Oracle, list the parameter names and types, each declaration separated by a comma:

CREATE PROCEDURE myproc1 (param1 char(10), param2 integer, param3 date)
AS

The SQL Server syntax is almost exactly the same, except you need to prefix the parameter names with the "@" sign.

CREATE PROCEDURE myproc1 (@param1 char(10), @param2 int, @param3 datetime)
AS

The function definitions are also very close. The syntax for the returning of the function value differs only slightly. In Oracle it looks like this:

CREATE FUNCTION myfunc1 (param1 date) RETURN DATE
AS

The function definitions in SQL Server looks like this:

CREATE FUNCTION myfunc1 (@param1 datetime) RETURNS DATETIME
AS

The big difference between the two is how they are called. This was something that tricked me up the first time I tried converting code between the two. Both expect a comma separated parameter list, however, Oracle expects the parameters to be enclosed in parentheses, this SQL Server does not. What we come down to, is this for Oracle: The function definitions are also very close. The syntax for the returning of the function value differs only slightly. In Oracle it looks like this:

EXEC myproc1 (param1, param2, param3);

while in SQL Server it looks like this:

EXEC myproc1 @param1, @param2, @param3;

This all gets a little murkier when you realize that in SQL Server user functions are called differently. A user function has to be called using 2 part names (you have to call it as dbo.myfunc and not just myfunc) *and* parameters are passed in parentheses, not as strings passed after a procedure call. This is yet another "gotcha" for the cross-database person.

Example time. In SQL Server it looks like this.

create procedure dbo.test1 (@date1 datetime, @days1 int)
as
  declare @result1 datetime;
  select @result1 = dateadd(dd, @days1, @date1);
  print 'Input : ' + convert(varchar(10), @date1, 101);
  print 'Output: ' + convert(varchar(10), @result1, 101);
go

Command(s) completed successfully.

declare @indate1 datetime;
select @indate1=getdate();
exec dbo.test1 @indate1,2

Input : 11/07/2011
Output: 11/09/2011

And the same kind of thing in Oracle:

create or replace procedure test1 (date1 date, days1 integer)
as
  result1 date;
begin
  result1 := date1 + days1;
  dbms_output.put_line('Input : ' || to_char(date1, 'MM/DD/YYYY'));
  dbms_output.put_line('Output: ' || to_char(date1, 'MM/DD/YYYY'));
end;
/

Procedure created.

SQL> set serveroutput on
SQL> exec test1(sysdate, 2);

Input : 11/07/2011
Output: 11/09/2011

PL/SQL procedure successfully completed.

SQL> 

The result are the same, but there's the little syntactical tricks that are needed in order to make it work.

Finally, same treatment for similar functions. SQL Server first.

create function dbo.func1 (@date1 datetime, @days1 int) 
returns datetime
as
begin
  declare @result1 datetime;
  select @result1 = dateadd(dd, @days1, @date1);
  RETURN @result1
end
GO

Command(s) completed successfully.

declare @indate1 datetime;
declare @outdate1 datetime;
select @indate1=getdate();
select @outdate1=dbo.func1(@indate1, 3)
print 'Input : ' + convert(varchar(10), @indate1, 101);
print 'Output: ' + convert(varchar(10), @outdate1, 101);

Input : 11/07/2011
Output: 11/10/2011

As noted above, if you replace "dbo.func1" with "func1", you get this error message:

Msg 195, Level 15, State 10, Line 4
'func1' is not a recognized built-in function name.

And finally, for Oracle:

create or replace function func1 (date1 date, days1 integer)
return date
as
  result1 date;
begin
  result1 := date1 + days1;
  return result1;
end;
/
Function created.

SQL> select 'Input Date = '||sysdate||' / Output Date = '||func1(sysdate, 3)
  2      as Input_Output_Date from dual;

INPUT_OUTPUT_DATE
------------------------------------------------
Input Date = 07-NOV-11 / Output Date = 10-NOV-11

SQL> 

When working in two different databases, it's important to keep in mind the differences in syntax between them when working on whatever solution you are working on. Little things like a missing pair of parentheses, or missing @'s can cause you just enough irritation to make you want to pull out what in my case, is the very little hair left on your head. The bottom line, especially for people like me who walk both sides of the Oracle / SQL Server line is to make sure you understand the subtle differences in the syntax you're working with.

As an "end of post" bonus, just because we're talking about functions, there's something else I'd like to throw out there. Here's a post I wrote in September and how a compatibility setting can "break" a Dynamic Management Function - sys.dm_db_index_physical_stats is broken !?

Sunday, November 6, 2011

Addendum to "My database went down during a hot backup"


"This place gives me an uncomfortableness." - Jayne Cobb, Firefly - 'Safe'
In checking the archives of my blog, I noticed this posting from September.

There's another way to fix this and for completeness, thought I should post it

Once you've attempted to start the database, you'll get as far as "mounted' before the error message appears. If you can check your alert logs and take the affected tablespaces out of backup mode:

alter tablespace MYTS end backup;

all you'll need to do is to issue "ALTER DATABASE OPEN;" and the database will open.

Either method will work. I've had the opportunity to use both in the last couple of months.