Should I use OLAP for time series analysis -
11-04-2004
, 07:34 PM
Hello
I have a poorly structured, poorly featured application
based on Excel . I need a a more sophisticated database back-end that is?
streamlined for my application and viable for the long term. Somebody
suggested
what I need is an OLAP database with an Excel front end.
Here's an outline on what I am trying to do. Much appreciate any
suggestions what approach/platforms I might consider.
The application involves lots of time stamped data that is statistically
driven and searched for time-series relationships via User interface
1. Currently the data is stored across multiple Excel
folders/workbooks/worksheets. I couldn't do it any other way with Excel
due to performance limitation - I therefore need to store elsewhere:
OLAP database?,
something like one Table per Excel worksheet. Alltogether there could
be hundreds of
worksheets with up to 10 000 rows x 100 columns of data
2.All the rows are date stamped and the
sequential order of the rows are critical to the application.
All the rows across every worksheet (belonging to specified set of
workbooks/folders) are date
synchronized, i.e. every row1 has the same date. Every day (every hour
in the future)
I get a new row of data for all the worksheets. Except for the date all
the other cell
values are numbers that are drawn from an external source or
statistically derived in Excel
2. Each Excel column has five headers (Floor.Room.Shelf.Book..
hierarchical concept). These headers are meaningful to the User as she
specifies them and uses them in equations to derive more new
statistics down new columns. The data for the column headers she
specifies are dropped onto the worksheet from special VBA Array. The
statistics are calculated by Excel on the worksheet. I now want to do
some of the calculations in VBA for special equations. Say the User
inputs something like X(5) + Y(1). Here I need to associate the symbol
X with the five part column header (so I can use it to identify the
column in the database or Array). The number refers to the offset from
the current row . So X(-5) refers to the cell five rows (i.e. five days)
below (or above) the current row.
3. Once I have the base data I want to do a lot of queries, searching
for timeseries relationshiops. Something
like: Sort the values of parameter X (from Start date to Stop date) to
into ten bins (e.g. histogram) and store them For all the values of X
that fall in Bin1 find the corresponding values (by date) for parameters
Y and Z. Or, for all of the values of X that falls into Bin1 and Y that
falls into Bin2 find the values of Z. Now lets sort the values we just
found for Z into ten bins. Or, look at data every 5th day beginning from
Start date. (Then plot and chart the results.)
I have to retain the Floor.Room.Shelf...with date sequenced rows
structure only from the Users perspective.The User specifies existing
columns (for which data must be extracted from the database), defines
new columns headings and equations to create the new data. She
initiates Queries like those described above.The User may review the
results that fall on these worksheets and store them in the database if
desired for later review.
Appreciate any suggestions,
Kurb |