A couple of thoughts:
1) You should definitely be looking at partitioning a cube at that size. See
the Analysis Services perf guide for why:
http://www.microsoft.com/technet/pro.../ansvcspg.mspx
This requires a SQL Server 2000 Enterprise Edition license.
2) You should be careful in your design around when you will need to do a
full reprocess of the data as it will take a considerable period of time.
For example, if you have a non-changing dimension in the cube design, and
you do a full process of that dimension, then all cubes will knocked
off-line until you reprocess them from scratch. With a cube your size, this
is hours and hours -- so you will need to plan activities such as this.
3) As a very general rule of thumb, I use a million rows per minute as a
good server-class machine throughput for processing a cube partition. Now
you can get better than that using fast I/O devices, processing partitions
in parallel, etc., but 1 million / hr is a good planning number. You will
need to run the Optimize Schema wizard on it to reduce the complexity of the
SQL statement used by process (again, see the perf guide above), so that you
get as close to a table scan as possible on the RDBMS side for the
processing SQL. Using that rule of thumb, if you have 5 years of fact data,
let's say 60 monthly partitions, you are looking at a billion combined
records, or about 1000 mintues, or approx 17 hours of processing to
re-process the entire cube.
4) A good best practice is that you prototype a smaller environment, such as
6 months worth of data and then linearly extrapolate the full dataset
reprocessing. Analysis Services is good at that because partition processing
is pretty much linear.
That being said, with careful planning, I've seen systems with billions and
billions of combined fact table records. So what you describe is very
doable.
--
Dave Wickert [MS]
dwickert (AT) online (DOT) microsoft.com
Program Manager
BI Practices Team
SQL BI Product Unit (Analysis Services)
--
This posting is provided "AS IS" with no warranties, and confers no rights.
"Mark" <anonymous (AT) discussions (DOT) microsoft.com> wrote
Quote:
I'm working through a pilot scenario and am wondering how big is a "Big"
fact table. I have about 190,000,000 rows in the fact table for just one
|
year's worth of data and wonder if that will be a problem when I get 2 or 3
years (or if it might be a problem now).
Quote:
Any insight would be great - I suppose I could try to redesign the data or
preaggregate the items, but just wondered what people had found.
Thanks in advance for any help.
Mark |