On Nov 8, 12:04 am, yf... (AT) vtn1 (DOT) victoria.tc.ca (Malcolm Dew-Jones)
wrote:
Quote:
Robert (irishhac... (AT) gmail (DOT) com) wrote:
: Does Oracle have anything comparable to the data transformation
: procedures in SAS/BASE? In particular, the SAS datastep? ( SAS/BASE is
Yes, SQL (I'm serious).
I recall a really long involved SAS program I had to convert.
After reverse engineering it, and analysing what it was realy trying to
accomplish (took a few days for sure), then I was able to write the
equivalent logic with a couple of group-by SQL statements.
I guess if all you're doing is feeding data into the statistical packages
then it might be harder to reproduce the results. |
I think that's a one sided view, although there are cases where the
SQL SELECT statement is a lot easier to write than the equivalent SAS
datastep, there are many data transformation problems for which the
reverse is true. I have spent more than eight years preparing data for
analysis, and generating complex statistical tables for clinical trial
research - so I understand that the SAS programming language has it's
problems. The SQL programming language does not solve those problems -
it has problems of it's own.
The data transformation programming that the stat analyst has to do ,
whether creating analysis datasets from raw datasets, or creating
analyses and tables, is considerable. I would like to see someone try
to do it all in SQL, with no use of SAS transformations ( mostly the
SAS datastep, but also the TRANSPOSE/MEAN/FREQ procedures ). It would
be a great experiment - but I know the results would be a disaster.
The SQL syntax was developed within the constraints of the relational
model - for data query that's great, but for data preparation/
transformation there's just too many things the statistical programmer
is not allowed to do. These restrictions can cost the programmer a
great deal of time. Leave SAS/STAT aside (that's for the analysis of
data that is already prepared), this is a comparison of SQL and SAS/
BASE, which often boils down to a comparison of the SQL SELECT
statement and the SAS datastep.
Which leads to a comparison of :
SQL
paragraph-style statistical languages ( SAS and SPSS are the most well
known, there's also Vilno)
S ( a statistical language, but completely different from SAS/SPSS )
MDX
PL/SQL and T/SQL
In my opinion, these categories are different tools, made for
different purposes. In the pharmaceutical industry, we use a paragraph-
style statistical language ( SAS ) for data preparation, every day of
the working week. If we tried to do the same thing in SQL or S , it
would be a disaster. Oh yes - we have productivity problems, big time.
But so far, SAS is the least bad choice of language for this type of
work. ( It's possible to develop much better. )
Back to the first question: Oracle and SAS are competing directly in a
more open manner than before. You can never do analysis without the
data preparation first. For data transformation, what does Oracle
have? Do they have something comparable, or if they don't have it, are
they going to get it ( in-house or through acquisitions)?