![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Greetings, I'm connecting via HTTP to an Analysis server. Whether or not I set "Execution Location=3;Default Isolation Mode=1" in my Connection String, I don't notice any change in performance. If I monitor Web Service:Bytes Sent/sec, it seems that my server (which doesn't host any web pages other than msolap.asp) is pumping out tons of data, regardless of the Execution Location setting. So I suspect that my client is still performing most of the analysis. I have checked to make sure that the client and server have the same Locale. I'm pretty sure they do. Is there any way to troubleshoot the Execution Location setting? Thanks, Jonathan |
#3
| |||||||
| |||||||
|
|
Hi Jonathan, "Jonathan Levine" wrote: Greetings, I'm connecting via HTTP to an Analysis server. Whether or not I set "Execution Location=3;Default Isolation Mode=1" in my Connection String, I don't notice any change in performance. If I monitor Web Service:Bytes Sent/sec, it seems that my server (which doesn't host any web pages other than msolap.asp) is pumping out tons of data, regardless of the Execution Location setting. So I suspect that my client is still performing most of the analysis. First thing to check, if you can, is whether you notice any difference when you connect not using HTTP. Some connection string properties get stripped away when you use an HTTP connection, and I'm sure other complicating factors come into play too. |
|
Once that's out of the way, there are some other scenarios where you can't remote a query to the server apart from when the locales on the two machines are different: - When the version of the server components is older than the version of the client components. Basically, make sure you have SP3a installed on your server and your client. |
|
- When you have ROLAP dimensions in your cube. Since the occasions when ROLAP dimensions are a valid design choice are few and far between, then if you have them you should consider getting rid of them. |
|
- When you use certain other, obscure connection string properties and functionality such as Default MDX Visual Mode and session cubes. You're unlikely to be using these unless your client is Excel, in which case try using a different client and see if there's a difference. |
|
- When you have calculated members that return strings, although this may well have been fixed in a hotfix by now. |
|
- When you use certain other functions in calculated members, such as Lookupcube or CreatePropertySet, or you use UDFs in your calculated members. |
|
That said, even when you can get Execution Location=3; Default Isolation Mode=1 to work, I find it's rarely the answer to performance problems. In fact, in some cases it can make performance worse. If you're having problems in this area (and from your previous posts, I think you are) then you might be better off looking elsewhere, such as double checking the actual aggregations that have been built on your new machine are the same as the ones on your old machine, using the Partition Manager tool; checking to see if the slices on your partitions are correctly set etc. |
#4
| |||
| |||
|
|
Hi, Chris. Thanks for your reply. "Chris Webb" wrote: Hi Jonathan, "Jonathan Levine" wrote: Greetings, I'm connecting via HTTP to an Analysis server. Whether or not I set "Execution Location=3;Default Isolation Mode=1" in my Connection String, I don't notice any change in performance. If I monitor Web Service:Bytes Sent/sec, it seems that my server (which doesn't host any web pages other than msolap.asp) is pumping out tons of data, regardless of the Execution Location setting. So I suspect that my client is still performing most of the analysis. First thing to check, if you can, is whether you notice any difference when you connect not using HTTP. Some connection string properties get stripped away when you use an HTTP connection, and I'm sure other complicating factors come into play too. It's difficult to tell for sure. I've been running similar queries using the MDX sample app in 4 configurations: - No HTTP, no Execution Location or Isolation Level - No HTTP, Execution Location=3;Default Isolation Level=1 - HTTP, no execution Location or Isolation Level - HTTP, Execution Location=3; Default Isolation Level=1 The query times seem similar in all 4 cases (that is, the first time a query is executed, it takes about a minute to return the results and subsequent queries take much less time). The working set of each process seems similar, too: about 13MB when it starts up, then about 75-80 MB after the query is executed. Typical results are about 36 cells -- 60 MB strikes me as a lot of memory for 36 cells if everything is executed on the server. Once that's out of the way, there are some other scenarios where you can't remote a query to the server apart from when the locales on the two machines are different: - When the version of the server components is older than the version of the client components. Basically, make sure you have SP3a installed on your server and your client. HKLM\Software\Microsoft\OLAP Server\CurrentVersion\CSD Version is "Service Pack 3" on both the client and server. - When you have ROLAP dimensions in your cube. Since the occasions when ROLAP dimensions are a valid design choice are few and far between, then if you have them you should consider getting rid of them. I don't have any ROLAP dimensions. The cubes themselves are also MOLAP. - When you use certain other, obscure connection string properties and functionality such as Default MDX Visual Mode and session cubes. You're unlikely to be using these unless your client is Excel, in which case try using a different client and see if there's a difference. So far, I've been connecting with the MDX sample and with VB.NET using ADO (not ADOMD). - When you have calculated members that return strings, although this may well have been fixed in a hotfix by now. Calculated Members return Integer or Big Integer - When you use certain other functions in calculated members, such as Lookupcube or CreatePropertySet, or you use UDFs in your calculated members. My aggregates are pretty simple. Sum and DistinctCount. That said, even when you can get Execution Location=3; Default Isolation Mode=1 to work, I find it's rarely the answer to performance problems. In fact, in some cases it can make performance worse. If you're having problems in this area (and from your previous posts, I think you are) then you might be better off looking elsewhere, such as double checking the actual aggregations that have been built on your new machine are the same as the ones on your old machine, using the Partition Manager tool; checking to see if the slices on your partitions are correctly set etc. I've been investigating a little with Partition Manager, and you may ultimately be right about this... would love your feedback on things to check. However, given the Performance monitor stats on the amount of data sent out by the web server, I am pretty sure that the time it takes to pump data between the machines is at least one issue. Regards, Jonathan |
#5
| ||||||
| ||||||
|
|
Hi Jonathan, One further scenario to try with your queries is running them directly on the server - what happens then? That said, I'm feeling more and more that remoting is not going to be the answer you're looking for. |
|
A question: when you say that your aggregates are simple, either sum or distinct count, does that mean you're using the DistinctCount MDX function in calculated members, or do you have measures who have > the aggregation type Distinct Count. If the former, can you tell me a bit more about what you're doing with these calculations? If the latter, have you made sure that each Distinct Count measure is in a cube on its own, with no other measures? |
|
Do you have partitions in your cube? Yes. In each of the 3 cubes, I have one partition for "historical data" and one for "current data". "Current data" is from the beginning of May, 2004. Historical data is from January 2003 - April 2004. |
|
Regarding Partition Manager, it's useful for understanding the levels at which each aggregation is built at - the Analysis Services Performance Guide, which I think you've read, explains this very well. It would be a good idea to look at your old server (the one without the performance problems) and the new one and check to see that the same aggregations exist on both. |
|
Additionally, while you run your test queries you should have a look at the Query Number, DSN Requested and the DSN Used counters in Perfmon on your server. Query Number will tell you have many sub- queries each query is being broken up into - possibly hundreds or thousands for a single query. DSN Requested and DSN Used are lists of numbers giving the ordinal of the level on each dimension that each subquery is requesting data from. To explain this a bit further, imagine you had a cube with two dimensions apart from Measures - Time, with three levels (Year, Quarter, Month) and Product, with four levels (All, Category, SubCategory, Product Name), and which appear in that order in the treeview in the Cube Editor. You run a query, and it results in one subquery on the server, and DSN Requested returns the value 21, meaning that it requested data at level 2 from Time (ie Quarter) and level 1 from Product (ie All). DSN Used, however, returns the value 33 - which means that the data itself was read from an aggregation built at the Month/SubCategory levels, and the values requested had to be aggregated from here. It follows if you had built an aggregation at Quarter/All Product then your query would have been directly answered from that, and DSN Used would have returned 21 too; and of course, performance would have been better. What do you see happening on your server? |
|
"Jonathan Levine" wrote: Hi, Chris. Thanks for your reply. "Chris Webb" wrote: Hi Jonathan, "Jonathan Levine" wrote: Greetings, I'm connecting via HTTP to an Analysis server. Whether or not I set "Execution Location=3;Default Isolation Mode=1" in my Connection String, I don't notice any change in performance. If I monitor Web Service:Bytes Sent/sec, it seems that my server (which doesn't host any web pages other than msolap.asp) is pumping out tons of data, regardless of the Execution Location setting. So I suspect that my client is still performing most of the analysis. First thing to check, if you can, is whether you notice any difference when you connect not using HTTP. Some connection string properties get stripped away when you use an HTTP connection, and I'm sure other complicating factors come into play too. It's difficult to tell for sure. I've been running similar queries using the MDX sample app in 4 configurations: - No HTTP, no Execution Location or Isolation Level - No HTTP, Execution Location=3;Default Isolation Level=1 - HTTP, no execution Location or Isolation Level - HTTP, Execution Location=3; Default Isolation Level=1 The query times seem similar in all 4 cases (that is, the first time a query is executed, it takes about a minute to return the results and subsequent queries take much less time). The working set of each process seems similar, too: about 13MB when it starts up, then about 75-80 MB after the query is executed. Typical results are about 36 cells -- 60 MB strikes me as a lot of memory for 36 cells if everything is executed on the server. Once that's out of the way, there are some other scenarios where you can't remote a query to the server apart from when the locales on the two machines are different: - When the version of the server components is older than the version of the client components. Basically, make sure you have SP3a installed on your server and your client. HKLM\Software\Microsoft\OLAP Server\CurrentVersion\CSD Version is "Service Pack 3" on both the client and server. - When you have ROLAP dimensions in your cube. Since the occasions when ROLAP dimensions are a valid design choice are few and far between, then if you have them you should consider getting rid of them. I don't have any ROLAP dimensions. The cubes themselves are also MOLAP. - When you use certain other, obscure connection string properties and functionality such as Default MDX Visual Mode and session cubes. You're unlikely to be using these unless your client is Excel, in which case try using a different client and see if there's a difference. So far, I've been connecting with the MDX sample and with VB.NET using ADO (not ADOMD). - When you have calculated members that return strings, although this may well have been fixed in a hotfix by now. Calculated Members return Integer or Big Integer - When you use certain other functions in calculated members, such as Lookupcube or CreatePropertySet, or you use UDFs in your calculated members. My aggregates are pretty simple. Sum and DistinctCount. That said, even when you can get Execution Location=3; Default Isolation Mode=1 to work, I find it's rarely the answer to performance problems. In fact, in some cases it can make performance worse. If you're having problems in this area (and from your previous posts, I think you are) then you might be better off looking elsewhere, such as double checking the actual aggregations that have been built on your new machine are the same as the ones on your old machine, using the Partition Manager tool; checking to see if the slices on your partitions are correctly set etc. I've been investigating a little with Partition Manager, and you may ultimately be right about this... would love your feedback on things to check. However, given the Performance monitor stats on the amount of data sent out by the web server, I am pretty sure that the time it takes to pump data between the machines is at least one issue. Regards, Jonathan |
#6
| |||
| |||
|
|
Hi Chris, Thanks very much for your detailed email. "Chris Webb" wrote: Hi Jonathan, One further scenario to try with your queries is running them directly on the server - what happens then? That said, I'm feeling more and more that remoting is not going to be the answer you're looking for. It's much faster on the server. Lately, I've been noticing a significant difference between connecting via HTTP (much slower) and connecting directly (faster, but still noticably slower than the server). Here's something interesting: The query uses a range into a date dimension. On the remote client, If I run the query once via HTTP, then I change the date range and run the query again, it's almost instantaneous the second time. On the other hand, if I run the same test via direct connection, the query time is almost the same the first time or the second time. A question: when you say that your aggregates are simple, either sum or distinct count, does that mean you're using the DistinctCount MDX function in calculated members, or do you have measures who have > the aggregation type Distinct Count. If the former, can you tell me a bit more about what you're doing with these calculations? If the latter, have you made sure that each Distinct Count measure is in a cube on its own, with no other measures? It is the latter, with each Distinct Count measure in its own cube, and the various cubes combined in a Virtual cube. Yes, the Distinct Count cubes don't have any other measures. Do you have partitions in your cube? Yes. In each of the 3 cubes, I have one partition for "historical data" and one for "current data". "Current data" is from the beginning of May, 2004. Historical data is from January 2003 - April 2004. Regarding Partition Manager, it's useful for understanding the levels at which each aggregation is built at - the Analysis Services Performance Guide, which I think you've read, explains this very well. It would be a good idea to look at your old server (the one without the performance problems) and the new one and check to see that the same aggregations exist on both. The aggregations do look substantially the same. I modified them to be exactly the same and didn't see any performance difference. One thing I noticed is that one of the dimensions doesn't appear at all in the Partition Manager. Specifically, the axis that I think would benefit the most from aggregation (also the largest axis) doesn't show up. Additionally, while you run your test queries you should have a look at the Query Number, DSN Requested and the DSN Used counters in Perfmon on your server. Query Number will tell you have many sub- queries each query is being broken up into - possibly hundreds or thousands for a single query. DSN Requested and DSN Used are lists of numbers giving the ordinal of the level on each dimension that each subquery is requesting data from. To explain this a bit further, imagine you had a cube with two dimensions apart from Measures - Time, with three levels (Year, Quarter, Month) and Product, with four levels (All, Category, SubCategory, Product Name), and which appear in that order in the treeview in the Cube Editor. You run a query, and it results in one subquery on the server, and DSN Requested returns the value 21, meaning that it requested data at level 2 from Time (ie Quarter) and level 1 from Product (ie All). DSN Used, however, returns the value 33 - which means that the data itself was read from an aggregation built at the Month/SubCategory levels, and the values requested had to be aggregated from here. It follows if you had built an aggregation at Quarter/All Product then your query would have been directly answered from that, and DSN Used would have returned 21 too; and of course, performance would have been better. What do you see happening on your server? On my server, I'm seeing query number increment by about 15 per query, with DSN Reqested = 511122 and DNS Used = 546223 Is there some way to understand which level maps to which digit in the DSN Requested/Used perfmon? -- Jonathan "Jonathan Levine" wrote: Hi, Chris. Thanks for your reply. "Chris Webb" wrote: Hi Jonathan, "Jonathan Levine" wrote: Greetings, I'm connecting via HTTP to an Analysis server. Whether or not I set "Execution Location=3;Default Isolation Mode=1" in my Connection String, I don't notice any change in performance. If I monitor Web Service:Bytes Sent/sec, it seems that my server (which doesn't host any web pages other than msolap.asp) is pumping out tons of data, regardless of the Execution Location setting. So I suspect that my client is still performing most of the analysis. First thing to check, if you can, is whether you notice any difference when you connect not using HTTP. Some connection string properties get stripped away when you use an HTTP connection, and I'm sure other complicating factors come into play too. It's difficult to tell for sure. I've been running similar queries using the MDX sample app in 4 configurations: - No HTTP, no Execution Location or Isolation Level - No HTTP, Execution Location=3;Default Isolation Level=1 - HTTP, no execution Location or Isolation Level - HTTP, Execution Location=3; Default Isolation Level=1 The query times seem similar in all 4 cases (that is, the first time a query is executed, it takes about a minute to return the results and subsequent queries take much less time). The working set of each process seems similar, too: about 13MB when it starts up, then about 75-80 MB after the query is executed. Typical results are about 36 cells -- 60 MB strikes me as a lot of memory for 36 cells if everything is executed on the server. Once that's out of the way, there are some other scenarios where you can't remote a query to the server apart from when the locales on the two machines are different: - When the version of the server components is older than the version of the client components. Basically, make sure you have SP3a installed on your server and your client. HKLM\Software\Microsoft\OLAP Server\CurrentVersion\CSD Version is "Service Pack 3" on both the client and server. - When you have ROLAP dimensions in your cube. Since the occasions when ROLAP dimensions are a valid design choice are few and far between, then if you have them you should consider getting rid of them. I don't have any ROLAP dimensions. The cubes themselves are also MOLAP. - When you use certain other, obscure connection string properties and functionality such as Default MDX Visual Mode and session cubes. You're unlikely to be using these unless your client is Excel, in which case try using a different client and see if there's a difference. So far, I've been connecting with the MDX sample and with VB.NET using ADO (not ADOMD). - When you have calculated members that return strings, although this may well have been fixed in a hotfix by now. Calculated Members return Integer or Big Integer - When you use certain other functions in calculated members, such as Lookupcube or CreatePropertySet, or you use UDFs in your calculated members. My aggregates are pretty simple. Sum and DistinctCount. That said, even when you can get Execution Location=3; Default Isolation Mode=1 to work, I find it's rarely the answer to performance problems. In fact, in some cases it can make performance worse. If you're having problems in this area (and from your previous posts, I think you are) then you might be better off looking elsewhere, such as double checking the actual aggregations that have been built on your new machine are the same as the ones on your old machine, using the Partition Manager tool; checking to see if the slices on your partitions are correctly set etc. I've been investigating a little with Partition Manager, and you may ultimately be right about this... would love your feedback on things to check. However, given the Performance monitor stats on the amount of data sent out by the web server, I am pretty sure that the time it takes to pump data between the machines is at least one issue. Regards, Jonathan |
#7
| |||
| |||
|
|
I wonder if I'm getting bitten by Bug #: 14027 (Plato 7x) "HTTP Connections to Analysis Services Server Computers Running Windows Server 2003 Are Slow" (KB article 822652). If so, how do I go about getting the hotfix? |
#8
| |||
| |||
|
|
"Jonathan Levine" wrote: I wonder if I'm getting bitten by Bug #: 14027 (Plato 7x) "HTTP Connections to Analysis Services Server Computers Running Windows Server 2003 Are Slow" (KB article 822652). If so, how do I go about getting the hotfix? The hotfix did the trick. Thanks again for your help, Chris. -- J |
#9
| |||
| |||
|
|
First, partitions. You mention that you have a partition for the current month and a partition for everything else - if so, what are you setting as the slice value for this historical partition, or have you left it blank? It's very important to set the slice value for a partition to get the maximum performance gain, so I would recommend instead having a partition for each month and setting the slice value appropriately. |
|
Secondly, regarding aggregations, you asked how to decode the DSN Requested and DSN Used values. The first digit in these numbers refers to the first dimension in the treeview in the cube editor; the second to the second in the treeview etc. Now, you said that each query you run produces around 15 subqueries; each of these could produce a different figure for DSN Requested and DSN Used, and I assume the values you gave me were the ones for the last subquery. However, these values are very interesting: 511122 shows you're requesting data at the top levels of all but the first dimension, and 546223 shows it's having to take data from a much lower level (possibly the leaf level of all dimensions?) to get these values meaning that performance is going to be much worse than it should. Because of >the way that AS designs aggregations, my guess is that you're not hitting any aggregations with these queries and therefore not getting any performance gain from them - it's very unlikely any dimensions would have been designed at level 5 of this first dimension (see the AS Performance Guide to find out more about why this is - the one third rule etc) even with Usage Based Optimisation. Can you tell me a bit more about this first dimension? What I would recommend is using Partition Manager to build an aggregation aimed specifically at your query, ie at 511122 and see how much of an improvement it makes in performance - you might be surprised at the result. |
|
One last thing - can you also tell me some more about the dimension that doesn't appear in Partition Manager? |
#10
| |||
| |||
|
|
Chris, Thanks for your latest post. "Chris Webb" wrote: First, partitions. You mention that you have a partition for the current month and a partition for everything else - if so, what are you setting as the slice value for this historical partition, or have you left it blank? It's very important to set the slice value for a partition to get the maximum performance gain, so I would recommend instead having a partition for each month and setting the slice value appropriately. Sorry for my ignorance, but by "slice value," I'm assuming you mean the WHERE clause in Partition Wizard/Advanced...? If so, it is reg_date >= '2004-5-1' in the "current" partition and reg_date < '2004-5-1' in the "historical" partition. I think this is OK -- 99.9% of the queries are on the "current" partition. Secondly, regarding aggregations, you asked how to decode the DSN Requested and DSN Used values. The first digit in these numbers refers to the first dimension in the treeview in the cube editor; the second to the second in the treeview etc. Now, you said that each query you run produces around 15 subqueries; each of these could produce a different figure for DSN Requested and DSN Used, and I assume the values you gave me were the ones for the last subquery. However, these values are very interesting: 511122 shows you're requesting data at the top levels of all but the first dimension, and 546223 shows it's having to take data from a much lower level (possibly the leaf level of all dimensions?) to get these values meaning that performance is going to be much worse than it should. Because of >the way that AS designs aggregations, my guess is that you're not hitting any aggregations with these queries and therefore not getting any performance gain from them - it's very unlikely any dimensions would have been designed at level 5 of this first dimension (see the AS Performance Guide to find out more about why this is - the one third rule etc) even with Usage Based Optimisation. Can you tell me a bit more about this first dimension? What I would recommend is using Partition Manager to build an aggregation aimed specifically at your query, ie at 511122 and see how much of an improvement it makes in performance - you might be surprised at the result. One last thing - can you also tell me some more about the dimension that doesn't appear in Partition Manager? I've gone through and removed all the dimensions that are not used by my specific queries, and the missing dimension is back. I think the problem was that I had (incorrectly) specified that the missing dimension was dependant on another dimension, since deleted. The DSN requested vs DSN used has also improved: requested = 5223 vs used 5233. And the queries are now really blazingly fast. What has taken about an hour before the hotfix is down to about 2 minutes. Not sure how much is due to the hotfix and how much due to reaggregation and removal of unused dimensions. Regards, Jonathan |
![]() |
| Thread Tools | |
| Display Modes | |
| |