dbTalk Databases Forums  

Transfer Techniques.

microsoft.public.sqlserver.olap microsoft.public.sqlserver.olap


Discuss Transfer Techniques. in the microsoft.public.sqlserver.olap forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Michael Morse
 
Posts: n/a

Default Transfer Techniques. - 10-08-2003 , 08:30 AM






When I am transferring lets say a 'Product Class' table
from the production servers to the data warehouse. is it
better practice to always delete the current product
classes in the table and then re add the ones from
production servers? Or, would it be better to look for
the changes between production and data warehouse and only
add or update those that need to be?

It seems that it is easier to just delete all the records
in Product Class and then re-add but what about over all
efficiency of the SQL server pertaining particularly to
storage abilities. (Page splits, heaps and stuff like
that).

Since product class generally seems to be a small table
would you use the same method for say a customer table
which for us contains upwards of 7000 records. To me, it
just seems to delete and re-add these on a nightly basis
would almost cause SQL fragmentation if you will. Any
thoughts on this.

-mike

Reply With Quote
  #2  
Old   
rharmon
 
Posts: n/a

Default Re: Transfer Techniques. - 10-08-2003 , 10:27 AM






I do understand your concerns about fragmentation, but there may be more at
stake.

As a rule, I almost never delete data in a warehouse, and try to keep
updates to temporal fields only (start date, end date of a record)

Approaching the issue from a logical perspective rather than physical,
deleting data or even updating data in a warehouse can cause long term side
effects and you bring up two datasets that may be more susceptible than
others; products and customers.

As a quick "for instance" lets assume we are analyzing product sales by
customer region. In the product table, we have a price value, in the
customer table we have location information (city, state etc.) Both of these
attributes are going to be necessary to calculate product sales by region.
Seems simple enough, however, if our warehouse scheme doesn't take into
account that the price of a given product will change and the address of a
given customer will change the resulting capability of the warehouse becomes
quite limited. Today, I calculate this metric and come up with a given
result set. Once prices change or customers move, I can't accurately
calculate that same metric for that same time period.

So, what's my suggestion? Stage the data in a staging database, add a
from_date_time and to_date_time to the records in the warehouse. If a record
from the source system does not match the "current" record in the warehouse,
update the to_date_time in the warehouse and insert the new record. You may
also want to institute a strong surrogate key pipeline to make querying
easier rather than complex joins on from-todates.



"Michael Morse" <mike.morse (AT) micromo (DOT) com> wrote

Quote:
When I am transferring lets say a 'Product Class' table
from the production servers to the data warehouse. is it
better practice to always delete the current product
classes in the table and then re add the ones from
production servers? Or, would it be better to look for
the changes between production and data warehouse and only
add or update those that need to be?

It seems that it is easier to just delete all the records
in Product Class and then re-add but what about over all
efficiency of the SQL server pertaining particularly to
storage abilities. (Page splits, heaps and stuff like
that).

Since product class generally seems to be a small table
would you use the same method for say a customer table
which for us contains upwards of 7000 records. To me, it
just seems to delete and re-add these on a nightly basis
would almost cause SQL fragmentation if you will. Any
thoughts on this.

-mike



Reply With Quote
  #3  
Old   
Bill Cheng [MSFT]
 
Posts: n/a

Default RE: Transfer Techniques. - 10-09-2003 , 05:52 AM



Hi Michael,

I just wonder whether the fragmentation is so heavy that you need to
perform the deletion and insertion on nightly basis? Have you used DBCC
SHOWCONTIG to check? Generally according to my experience, high values for
logical scan fragmentation can lead to degraded performance of index scans.
In the tests, workload performance increased after defragmenting when
clustered indexes had logical fragmentation greater than 10 percent, and
significant increases were attained when logical fragmentation levels were
greater than 20 percent. Consider defragmenting indexes with 20 percent or
more logical fragmentation. Remember that this value is meaningless when
reporting on a heap (Index ID = 0).

http://www.microsoft.com/technet/tre...hnet/prodtechn
ol/sql/maintain/Optimize/SS2KIDBP.asp

If you think the cost of deletion and insertion of the whole table is
better than some fragmentation of the table, you may do it. You may need to
experiment with different settings.


Bill Cheng
Microsoft Online Partner Support

Get Secure! - www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
--------------------
Quote:
Content-Class: urn:content-classes:message
From: "Michael Morse" <mike.morse (AT) micromo (DOT) com
Sender: "Michael Morse" <mike.morse (AT) micromo (DOT) com
Subject: Transfer Techniques.
Date: Wed, 8 Oct 2003 06:30:26 -0700
Lines: 22
Message-ID: <2541001c38da0$555bcff0$a601280a (AT) phx (DOT) gbl
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300
Thread-Index: AcONoFVb0Zq+hxG4QNyCWV4/hlU6DA==
Newsgroups: microsoft.public.sqlserver.olap
Path: cpmsftngxa06.phx.gbl
Xref: cpmsftngxa06.phx.gbl microsoft.public.sqlserver.olap:43591
NNTP-Posting-Host: TK2MSFTNGXA14 10.40.1.166
X-Tomcat-NG: microsoft.public.sqlserver.olap

When I am transferring lets say a 'Product Class' table
from the production servers to the data warehouse. is it
better practice to always delete the current product
classes in the table and then re add the ones from
production servers? Or, would it be better to look for
the changes between production and data warehouse and only
add or update those that need to be?

It seems that it is easier to just delete all the records
in Product Class and then re-add but what about over all
efficiency of the SQL server pertaining particularly to
storage abilities. (Page splits, heaps and stuff like
that).

Since product class generally seems to be a small table
would you use the same method for say a customer table
which for us contains upwards of 7000 records. To me, it
just seems to delete and re-add these on a nightly basis
would almost cause SQL fragmentation if you will. Any
thoughts on this.

-mike



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.