dbTalk Databases Forums  

Advice on middleware products for TRUE Scaling out of SQL Server

comp.databases.ms-sqlserver


#11
Stu
Re: Advice on middleware products for TRUE Scaling out of SQL Server - 04-19-2006, 02:14 PM

Hey Ian, on average we hover around 30%, but it's been a long, painful
journey to get there. And those are Xeon processors, so to Windows it
appears as a quad-processor box.


#12
Brad
Re: Advice on middleware products for TRUE Scaling out of SQL Server - 04-19-2006, 02:51 PM

Ian,
Just curious, is this a transactional system or data warehouse or both?
We have extensive experience with scaling out SQL Server, but only
from a data warehousing (ETL and Reporting) perspective and not a
transactional perspective.
Thanks,
Brad


#13
Erland Sommarskog
Re: Advice on middleware products for TRUE Scaling out of SQL Server - 04-19-2006, 04:27 PM

IanIpp (ian.ippolito (AT) gmail (DOT) com) writes:
Quote:
3) Regarding tuning queries, etc.

Yes, we have control over the code, but we already run
extensive/constant query tuning, add/adjust indexes, and regularly
use the Database Tuning Advisor (see my post here for some of the
existing bugs I've found in SQL 2005's DTA:
http://rentacoder.com/CS/blogs/real_...03/17/447.aspx
). We also update statistics and defrag the indices (and rebuild the
ones that can't be defragged). There are 2 bugs I have open tickets on
with indices not being defragged even after rebuilding...and not on
small tables, but large ones with thousands of pages of data. I'll
update my blog once MSFT gives more information on what is going on.

I don't want to belittle, but I have a strong feeling that you still
have a lot to gain by tuning the application. Maybe you've passed all
the simple ones: adding indexes, finding bad queries etc, and you
will now have to look for more structural issues. That is, how much
iterative processing (cursors and the like) do you have?

After all, the numbers Stu gave for his system were appallingly better
than yours.

Of course, the TPC-C benchmark was even further afield, but that is a
value that more demonstrates the outer edge of what is at all possible.

I can't give any numbers for our system, but none of our customers are
close to the load that yours and Stu's systems see.

--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


#14
IanIpp
Re: Advice on middleware products for TRUE Scaling out of SQL Server - 04-19-2006, 05:33 PM

That's interesting info Stu.

Erland,
Quote:
I don't want to belittle, but I have a strong feeling that you still
have a lot to gain by tuning the application. Maybe you've passed all
the simple ones: adding indexes, finding bad queries etc, and you
will now have to look for more structural issues. That is, how much
iterative processing (cursors and the like) do you have?

Virtually no cursors and iterative processing, due to all the problems
with them.

Quote:
After all, the numbers Stu gave for his system were appallingly better
than yours.

Erland, it's premature to judge one way or the other, since it's not
necessarily an apples to apples comparison.

Imagine this example. Two systems are both perfectly tuned and the
applications are perfectly designed. One is doing one query over and
over again: SELECT one_field FROM [SimpleTable] and that table is only
a few thousand rows. The second server is doing another query over
and over again: SELECT <lots of fields...> FROM [Table1] INNER JOIN
[Table2] INNER JOIN [Table3] INNER JOIN [Table4] ... with tables that
each have a few million rows. In this example, system 1 will get a
significantly higher # of TPS. This doesn't mean that you can jump to
the conclusion that system #2 is out of tune...it just means its job
requirements force it to do more work.

I'll give you a real life example. This is the most heavily used page
on the site (about 60% of the volume of traffic) and thus 60% of the
queries to the database:

http://www.rentacoder.com/RentACoder...ngExpiration=1

That page is actually called from a # of different places (newest bid
requests, my bid requests, search bid requests, browse bid request
category)...all of them lead to that page. But the end result is
always the same thing...show a list of bid requests. It seems simple
until you realize that there are over a million rows in our table of
registered people...and that must be joined to. We have half a million
bid requests and that table must be joined to. Connected to each of
these is an average of 50 bids (x half a million = 25 million rows) and
this table must be queried to produce some of the summary information.
Etc, etc.

Now maybe Stu's typical transaction is equally demanding. But without
asking him, we can't yet tell. (By the way Stu, do you know your
heaviest volume transaction and what kinds of table sizes are
involved?)

Some other interesting things. The biggest killer of time on that page
is the fact that it involves paging. This means that:
a) Everyone expects you to provide a feature that says Page # 1 of
<some #>...meaning you need to know how many total rows are in the
result set...even if you don't return them. So this requires doing a
COUNT (slow).
b) Paging is handled using a great new SQL Server 2005 feature called
ROW_NUMBER(). That feature shaves off several orders of magnitude of
time versus in 2000, as you can see:
http://sqljunkies.com/WebLog/amachan...1/03/4945.aspx.
Unfortunately it doesn't work properly on a DISTINCT query (which is
understandable)...which requires structuring it as a ROW_NUMBER() of a
subquery. But there is a bug in 2005...it can do this...but as soon as
you add the BETWEEN clause or WHERE (which is what allows you to save
time via this method) it gives an error and won't run. I've got a bug
report open with MSFT on this one too (I'm sure they love me...I have
way too many tickets open right now).

But my point is that back in the SQL 2000 days you didn't have this new
feature...and you just had to put up with the slowness if you were
doing paging.

Ian
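
For readers following along, here is a minimal sketch of the paging shape described in (a) and (b) above. It is not the actual Rent a Coder query: the tables dbo.BidRequests and dbo.Bids, their columns, and the @PageNumber/@PageSize variables are all made up for illustration. The DISTINCT goes in an inner query because ROW_NUMBER() gives every row a different value, so a DISTINCT in the same SELECT would no longer remove anything. COUNT(*) OVER () is one possible way to get the total row count in the same pass; whether it beats a separate COUNT query would have to be measured. The sketch says nothing about the error reported above.

```sql
-- Hypothetical schema: dbo.BidRequests(BidRequestId, Title, PostedDate)
--                      dbo.Bids(BidId, BidRequestId)
-- SQL Server 2005 syntax: variables are declared, then assigned with SET.
DECLARE @PageNumber int, @PageSize int;
SET @PageNumber = 3;
SET @PageSize = 10;

WITH DistinctRequests AS (
    -- Deduplicate first: the join to Bids yields one row per bid.
    SELECT DISTINCT br.BidRequestId, br.Title, br.PostedDate
    FROM dbo.BidRequests AS br
    INNER JOIN dbo.Bids AS b
        ON b.BidRequestId = br.BidRequestId
),
Numbered AS (
    -- Number the distinct rows and carry the total count on every row.
    SELECT d.BidRequestId, d.Title, d.PostedDate,
           ROW_NUMBER() OVER (ORDER BY d.PostedDate DESC, d.BidRequestId) AS RowNum,
           COUNT(*) OVER () AS TotalRows
    FROM DistinctRequests AS d
)
-- Keep only the rows belonging to the requested page.
SELECT BidRequestId, Title, PostedDate, RowNum, TotalRows
FROM Numbered
WHERE RowNum BETWEEN (@PageNumber - 1) * @PageSize + 1
                 AND @PageNumber * @PageSize
ORDER BY RowNum;
```

The second ORDER BY column (BidRequestId) is there only to make the numbering deterministic when two requests share a PostedDate, so a row does not jump between pages from one request to the next.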



#15
Stu
Re: Advice on middleware products for TRUE Scaling out of SQL Server - 04-19-2006, 07:30 PM

I'm reluctant to give too many details about our structure for fear
that I'd be compromising our business model; I can say that our primary
application is a data warehouse structure that involves several million
rows of data per day. Data is loaded in small batches (every minute,
24x7), and we do our own pattern seeking against the new data (kind of
a home-grown analysis services).

Our batches can range in size from 1 row to 20,000 rows, depending on
the time of day, and the nature of the data. We host both a raw
database (involving a very verbose but simple OLTP structure) and the
data warehouse on the same box.

Some of the things we do to lessen the bottleneck on the server
include:

1. distribute the ETL process as much as possible. We have several
little bots that run on various servers that handle loading, analysis,
and grouping off the main server. We used to use DTS quite heavily;
we're transitioning away from that.

2. Use appropriate locking hints. We write all sorts of code involving
NOLOCK and temp tables to prefetch the data.

3. Make the most of our physical structure. We use filegroups to
separate indexes from tables, and have isolated our busiest databases
from each other on separate drive arrays.

4. We always cluster on monotonically-increasing values (such as dates
or date representation) so that page splits are minimal. We also use
partitioned views (although they are a bit of a bear to maintain).
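
Points 2 and 4 above could look roughly like the following in T-SQL. This is only a sketch under assumptions, not Stu's actual schema: the Events tables, their columns, the monthly split, and the #RecentEvents temp table are all hypothetical. GO is the batch separator understood by tools such as sqlcmd and Management Studio, not a T-SQL statement.

```sql
-- Hypothetical monthly member tables. The clustered key leads with the
-- ever-increasing date, so new rows land at the end of the index.
CREATE TABLE dbo.Events_200603 (
    EventDate datetime NOT NULL
        CHECK (EventDate >= '20060301' AND EventDate < '20060401'),
    EventId   int NOT NULL,
    Payload   varchar(200) NULL,
    CONSTRAINT PK_Events_200603 PRIMARY KEY CLUSTERED (EventDate, EventId)
);

CREATE TABLE dbo.Events_200604 (
    EventDate datetime NOT NULL
        CHECK (EventDate >= '20060401' AND EventDate < '20060501'),
    EventId   int NOT NULL,
    Payload   varchar(200) NULL,
    CONSTRAINT PK_Events_200604 PRIMARY KEY CLUSTERED (EventDate, EventId)
);
GO

-- Partitioned view: the non-overlapping CHECK constraints let the
-- optimizer skip member tables that cannot hold the requested dates.
CREATE VIEW dbo.Events
AS
    SELECT EventDate, EventId, Payload FROM dbo.Events_200603
    UNION ALL
    SELECT EventDate, EventId, Payload FROM dbo.Events_200604;
GO

-- Prefetch one day of data into a temp table without taking shared
-- locks. NOLOCK allows dirty reads, so this only suits data where an
-- occasional uncommitted row is acceptable.
SELECT EventDate, EventId, Payload
INTO #RecentEvents
FROM dbo.Events WITH (NOLOCK)
WHERE EventDate >= '20060419' AND EventDate < '20060420';
```

The maintenance burden mentioned in point 4 shows up here: every new month needs a new member table with its own CHECK constraint, and the view has to be altered to include it.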

HTH, Stu


#16
Erland Sommarskog
Re: Advice on middleware products for TRUE Scaling out of SQL Server - 04-20-2006, 02:31 AM

IanIpp (ian.ippolito (AT) gmail (DOT) com) writes:
Quote:
I'll give you a real life example. This is the most heavily used page
on the site (about 60% of the volume of traffic) and thus 60% of the
queries to the database:

http://www.rentacoder.com/RentACoder...ngExpiration=1
...
Some other interesting things. The biggest killer of time on that page
is the fact that it involves paging. This means that:

So each time I request the next page the query is rerun? Here is a tip
from a pure end-user perspective. Spit out 100 entries at a time rather
than just 10. I don't know why web designers insist on giving me small
pieces at a time. Give me at least 100 items at a time. I have better
things to do all day than paging back and forth in a lousy web browser.
On top of that, if the query is rerun each time I page, I may get to see
different results.

From a more technical perspective, saving the search results in a
process-keyed table could be an option, although it means that each
initial search will require a write operation, and if users don't page
very often, it could just make matters worse. (Then again, here is an
easy option to scale out: the middle tier could receive the full result
set, and then write the search to a different server.)
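
A rough sketch of what such a process-keyed table might look like, assuming a hypothetical dbo.SearchResults table keyed by a per-search GUID, and a hypothetical dbo.BidRequests(BidRequestId, Title, PostedDate) source table. The names, the LIKE filter, and the page size of 100 are illustrative only.

```sql
-- One row per hit, keyed by the search that produced it.
CREATE TABLE dbo.SearchResults (
    SearchKey    uniqueidentifier NOT NULL,
    RowNum       int NOT NULL,
    BidRequestId int NOT NULL,
    CreatedAt    datetime NOT NULL DEFAULT (GETDATE()),
    CONSTRAINT PK_SearchResults PRIMARY KEY CLUSTERED (SearchKey, RowNum)
);
GO

-- Initial search: run the expensive query once and save the ordered hits.
-- In practice the middle tier would keep @SearchKey and send it back
-- with every later page request.
DECLARE @SearchKey uniqueidentifier;
SET @SearchKey = NEWID();

INSERT INTO dbo.SearchResults (SearchKey, RowNum, BidRequestId)
SELECT @SearchKey,
       ROW_NUMBER() OVER (ORDER BY br.PostedDate DESC, br.BidRequestId),
       br.BidRequestId
FROM dbo.BidRequests AS br
WHERE br.Title LIKE '%database%';

-- Later page requests: a cheap range seek on the clustered key, and the
-- user sees the same result set on every page.
SELECT sr.RowNum, sr.BidRequestId
FROM dbo.SearchResults AS sr
WHERE sr.SearchKey = @SearchKey
  AND sr.RowNum BETWEEN 101 AND 200
ORDER BY sr.RowNum;
```

Old searches have to be purged at some point; the CreatedAt column is included so that a scheduled job could delete rows past a chosen age. As noted above, the extra write per search only pays off if users actually page.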


--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se



#17
IanIpp
Re: Advice on middleware products for TRUE Scaling out of SQL Server - 04-21-2006, 02:36 PM

Thanks for the information guys.

Stu, thanks for the info on your DB. The Rent a Coder DB is OLTP,
which definitely presents different challenges than a data warehousing
system because of the inherent conflicts of optimizing and locking that
occur with updates/inserts versus SELECTs (you also do updates/inserts,
but only when refreshing the data).

-----------------------------------------------------------

By the way, I found some of the answers to my original question:
1) The Metaverse database scattering middleware product is NOT
available yet...still in Alpha. The owner actually recommended the 2nd
product.
2) The PCTI Corp middleware product (ICS-UDS) is available. The owner
is preparing references from several companies...two of which are
household names, so that is encouraging.

---------------------------------------------------------------

Here's an unrelated question for anyone:

The built-in SQL Server 2005 tools for analyzing performance problems
are very limited when trying to diagnose active problems. Here are a
few tools that I found that make up for some of the deficiencies:

1)
http://www.quest.com/quest_central_f...e_analysis.asp
(There is a nice video explaining the problems with the existing SQL
2005 tools here:
http://www.quest.com/Quest_Central_f...r/dba_tale.htm )
2) http://www.sqlpower.com/index.html (Pepsi uses this product)

I'm sure there are many more. Which add-on tool do you use and why?

Thanks,
Ian Ippolito
RentACoder.com

