![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hello, I am having some pretty major performance issues with a cube I've built. My goal for this post is to give specifics related to the data and Analysis Services hardware to try and get some feedback from the community on whether or not I am having valid expectations performance-wise. Hardware: -Two dual-core Xeon 2.8GHz processors -3GB RAM Data: The data is retail transaction data which is at the line item level. The most recent years have 50-60 million rows and I am partitioning by year. In addition, each transaction has related discount and tender (payment) data. The tender data has approximately 15-20 million rows per year while the discount data has approximately 10-15 million rows per year. In my cube, each of these tables have their own separate fact group. I am pulling data from all three fact groups to create quite a few calculated members which build on each other to finally arrive at calculations such as GAAP Sales, sales for a particular product type, etc. I believe I have my hierarchies and relationships set up correctly in my dimensions. In addition I have aggregations designed on each partition at about the 40% level. Am I having reasonable expectations that data should be able to come back fairly quickly? I have some reports which take 10-15 minutes that bring back quite a lot of data for the whole company across the last two years. However, even some simple slicing and dicing in the cube browser can be slow. What is interesting to me is that even if I drop my time dimension on the rows axis before adding any measures or other dimensions, it can take 20 seconds or more to respond. All the while when reports are running or when I am browsing the cube myself, the CPU (all 8 in Task Manager [dual dual-core processors]) are spiking at 100%. While I believe I can get some good responses from MS folks and MVPs for my own benefit, I think any thoughtful posts will benefit the community as a whole by setting performance expectations and pointing out potential pitfalls. I have found that there are not a ton of good resources out there on performance, so I appreciate any tips anyone can offer. Thanks! Todd |
#3
| |||
| |||
|
|
do you use AS2000 or 2005? do you use x64 or x32 server? have you used the usage based optimization wizard to add aggregations? do you have calculated members? calculated sets? We have a server with 16Gb, Windows x64, SQL 2005 x64, 2 * Dual Core Opteron for the moment only 2 years of history, in all my partitions in all cubes I have 250 millions of rows by year (biggest fact contain 50 millions / year, and the cube in front of this fact provide standard SUMS and DCount!) the response time is excellent, the first access is allways slow (cold cache), but I can drill from years to days in seconds, and add filters and the response time still good. I'm playing with 2 dcount measures + a ratio between these 2 measures + ratio between 2 sums. The memory is shared between SQL, AS and reporting services on this server, and AS don't use more then 2Gb today (but when we'll be in production with more historical data the memory usage will be higher) in your case, try to disable the hyperthreading, sometimes this feature produce bad results. "Todd" <toddkitta (AT) gmail (DOT) com> wrote in message news:1153342986.762069.317490 (AT) p79g2000cwp (DOT) googlegroups.com... Hello, I am having some pretty major performance issues with a cube I've built. My goal for this post is to give specifics related to the data and Analysis Services hardware to try and get some feedback from the community on whether or not I am having valid expectations performance-wise. Hardware: -Two dual-core Xeon 2.8GHz processors -3GB RAM Data: The data is retail transaction data which is at the line item level. The most recent years have 50-60 million rows and I am partitioning by year. In addition, each transaction has related discount and tender (payment) data. The tender data has approximately 15-20 million rows per year while the discount data has approximately 10-15 million rows per year. In my cube, each of these tables have their own separate fact group. I am pulling data from all three fact groups to create quite a few calculated members which build on each other to finally arrive at calculations such as GAAP Sales, sales for a particular product type, etc. I believe I have my hierarchies and relationships set up correctly in my dimensions. In addition I have aggregations designed on each partition at about the 40% level. Am I having reasonable expectations that data should be able to come back fairly quickly? I have some reports which take 10-15 minutes that bring back quite a lot of data for the whole company across the last two years. However, even some simple slicing and dicing in the cube browser can be slow. What is interesting to me is that even if I drop my time dimension on the rows axis before adding any measures or other dimensions, it can take 20 seconds or more to respond. All the while when reports are running or when I am browsing the cube myself, the CPU (all 8 in Task Manager [dual dual-core processors]) are spiking at 100%. While I believe I can get some good responses from MS folks and MVPs for my own benefit, I think any thoughtful posts will benefit the community as a whole by setting performance expectations and pointing out potential pitfalls. I have found that there are not a ton of good resources out there on performance, so I appreciate any tips anyone can offer. Thanks! Todd |
#4
| |||
| |||
|
|
Jeje, First, thanks for your response. I always see your activity on the boards! I should have mentioned most of what you asked... I am using AS2005; x32; currently I am using Usage Based Optimizations, but I am not getting much better results than I had with the standard aggregation wizard. And I have a ton of calculated members, and a lot of them build on each other. I have not heard about disabling the hyperthreading. What are the issues here? Could it make a drastic difference? Any other ideas? Thanks again for your response. Jeje wrote: do you use AS2000 or 2005? do you use x64 or x32 server? have you used the usage based optimization wizard to add aggregations? do you have calculated members? calculated sets? We have a server with 16Gb, Windows x64, SQL 2005 x64, 2 * Dual Core Opteron for the moment only 2 years of history, in all my partitions in all cubes I have 250 millions of rows by year (biggest fact contain 50 millions / year, and the cube in front of this fact provide standard SUMS and DCount!) the response time is excellent, the first access is allways slow (cold cache), but I can drill from years to days in seconds, and add filters and the response time still good. I'm playing with 2 dcount measures + a ratio between these 2 measures + ratio between 2 sums. The memory is shared between SQL, AS and reporting services on this server, and AS don't use more then 2Gb today (but when we'll be in production with more historical data the memory usage will be higher) in your case, try to disable the hyperthreading, sometimes this feature produce bad results. "Todd" <toddkitta (AT) gmail (DOT) com> wrote in message news:1153342986.762069.317490 (AT) p79g2000cwp (DOT) googlegroups.com... Hello, I am having some pretty major performance issues with a cube I've built. My goal for this post is to give specifics related to the data and Analysis Services hardware to try and get some feedback from the community on whether or not I am having valid expectations performance-wise. Hardware: -Two dual-core Xeon 2.8GHz processors -3GB RAM Data: The data is retail transaction data which is at the line item level. The most recent years have 50-60 million rows and I am partitioning by year. In addition, each transaction has related discount and tender (payment) data. The tender data has approximately 15-20 million rows per year while the discount data has approximately 10-15 million rows per year. In my cube, each of these tables have their own separate fact group. I am pulling data from all three fact groups to create quite a few calculated members which build on each other to finally arrive at calculations such as GAAP Sales, sales for a particular product type, etc. I believe I have my hierarchies and relationships set up correctly in my dimensions. In addition I have aggregations designed on each partition at about the 40% level. Am I having reasonable expectations that data should be able to come back fairly quickly? I have some reports which take 10-15 minutes that bring back quite a lot of data for the whole company across the last two years. However, even some simple slicing and dicing in the cube browser can be slow. What is interesting to me is that even if I drop my time dimension on the rows axis before adding any measures or other dimensions, it can take 20 seconds or more to respond. All the while when reports are running or when I am browsing the cube myself, the CPU (all 8 in Task Manager [dual dual-core processors]) are spiking at 100%. While I believe I can get some good responses from MS folks and MVPs for my own benefit, I think any thoughtful posts will benefit the community as a whole by setting performance expectations and pointing out potential pitfalls. I have found that there are not a ton of good resources out there on performance, so I appreciate any tips anyone can offer. Thanks! Todd |
#5
| |||
| |||
|
|
I have read articles in the past year describe possible performance issues with hyperthreading with SQL Server. I don't know if this could impact AS. maybe what's appends: AS think there is 8 CPUs and optimize himself to use these 8 CPU this will cause bottleneck because there is too many threads and not enough real CPU. this cause a lot of waits. Hyperthreading is good when you run different applications which use different "part" of the CPU, I mean floating calculation and integer calculation at the same time. the performance advantage is really small for a single application. what is your frontend application? can you send your connectionstring? if you force the execution on the server side (execution location=3) sometimes this greatly decrease the performance!!! in my case, I have found that for dcount dynamic calculation the execution location on the server cause a slow performance, a query takes 2 minutes instead-of 20 seconds (AS2000 enterprise) but its not always true, and standard "sums" and "count" measures perform better on the server side. take a look at the query number in the performance monitor. when you execute a query and you see this number to raise quickly, its because you have recursive calculations or non empty crossjoin calculations which cause the server to take a long time to calculate the result. a standard access should execute between 1 and 4 queries (query number) using the pivottable. "Todd" <toddkitta (AT) gmail (DOT) com> wrote in message news:1153354436.542320.103860 (AT) m79g2000cwm (DOT) googlegroups.com... Jeje, First, thanks for your response. I always see your activity on the boards! I should have mentioned most of what you asked... I am using AS2005; x32; currently I am using Usage Based Optimizations, but I am not getting much better results than I had with the standard aggregation wizard. And I have a ton of calculated members, and a lot of them build on each other. I have not heard about disabling the hyperthreading. What are the issues here? Could it make a drastic difference? Any other ideas? Thanks again for your response. Jeje wrote: do you use AS2000 or 2005? do you use x64 or x32 server? have you used the usage based optimization wizard to add aggregations? do you have calculated members? calculated sets? We have a server with 16Gb, Windows x64, SQL 2005 x64, 2 * Dual Core Opteron for the moment only 2 years of history, in all my partitions in all cubes I have 250 millions of rows by year (biggest fact contain 50 millions / year, and the cube in front of this fact provide standard SUMS and DCount!) the response time is excellent, the first access is allways slow (cold cache), but I can drill from years to days in seconds, and add filters and the response time still good. I'm playing with 2 dcount measures + a ratio between these 2 measures + ratio between 2 sums. The memory is shared between SQL, AS and reporting services on this server, and AS don't use more then 2Gb today (but when we'll be in production with more historical data the memory usage will be higher) in your case, try to disable the hyperthreading, sometimes this feature produce bad results. "Todd" <toddkitta (AT) gmail (DOT) com> wrote in message news:1153342986.762069.317490 (AT) p79g2000cwp (DOT) googlegroups.com... Hello, I am having some pretty major performance issues with a cube I've built. My goal for this post is to give specifics related to the data and Analysis Services hardware to try and get some feedback from the community on whether or not I am having valid expectations performance-wise. Hardware: -Two dual-core Xeon 2.8GHz processors -3GB RAM Data: The data is retail transaction data which is at the line item level. The most recent years have 50-60 million rows and I am partitioning by year. In addition, each transaction has related discount and tender (payment) data. The tender data has approximately 15-20 million rows per year while the discount data has approximately 10-15 million rows per year. In my cube, each of these tables have their own separate fact group. I am pulling data from all three fact groups to create quite a few calculated members which build on each other to finally arrive at calculations such as GAAP Sales, sales for a particular product type, etc. I believe I have my hierarchies and relationships set up correctly in my dimensions. In addition I have aggregations designed on each partition at about the 40% level. Am I having reasonable expectations that data should be able to come back fairly quickly? I have some reports which take 10-15 minutes that bring back quite a lot of data for the whole company across the last two years. However, even some simple slicing and dicing in the cube browser can be slow. What is interesting to me is that even if I drop my time dimension on the rows axis before adding any measures or other dimensions, it can take 20 seconds or more to respond. All the while when reports are running or when I am browsing the cube myself, the CPU (all 8 in Task Manager [dual dual-core processors]) are spiking at 100%. While I believe I can get some good responses from MS folks and MVPs for my own benefit, I think any thoughtful posts will benefit the community as a whole by setting performance expectations and pointing out potential pitfalls. I have found that there are not a ton of good resources out there on performance, so I appreciate any tips anyone can offer. Thanks! Todd |
#6
| |||
| |||
|
#7
| |||
| |||
|
|
Not sure if this is helpful or not; but have you applied SP1 for AS? it has seemed to be faster for me I have also heard that parent child dimensions have a significant impact on perf and that the upcoming sp 2 should have some more performance fixes. Another thing that I have heard, but not tested is to have multiple cubes instead of one big cube with all of you measure groups. I think this technique is only benefical when you have a lot of non conformed dimensions or query that access measures from single measure groups "Jeje" wrote: I have read articles in the past year describe possible performance issues with hyperthreading with SQL Server. I don't know if this could impact AS. maybe what's appends: AS think there is 8 CPUs and optimize himself to use these 8 CPU this will cause bottleneck because there is too many threads and not enough real CPU. this cause a lot of waits. Hyperthreading is good when you run different applications which use different "part" of the CPU, I mean floating calculation and integer calculation at the same time. the performance advantage is really small for a single application. what is your frontend application? can you send your connectionstring? if you force the execution on the server side (execution location=3) sometimes this greatly decrease the performance!!! in my case, I have found that for dcount dynamic calculation the execution location on the server cause a slow performance, a query takes 2 minutes instead-of 20 seconds (AS2000 enterprise) but its not always true, and standard "sums" and "count" measures perform better on the server side. take a look at the query number in the performance monitor. when you execute a query and you see this number to raise quickly, its because you have recursive calculations or non empty crossjoin calculations which cause the server to take a long time to calculate the result. a standard access should execute between 1 and 4 queries (query number) using the pivottable. "Todd" <toddkitta (AT) gmail (DOT) com> wrote in message news:1153354436.542320.103860 (AT) m79g2000cwm (DOT) googlegroups.com... Jeje, First, thanks for your response. I always see your activity on the boards! I should have mentioned most of what you asked... I am using AS2005; x32; currently I am using Usage Based Optimizations, but I am not getting much better results than I had with the standard aggregation wizard. And I have a ton of calculated members, and a lot of them build on each other. I have not heard about disabling the hyperthreading. What are the issues here? Could it make a drastic difference? Any other ideas? Thanks again for your response. Jeje wrote: do you use AS2000 or 2005? do you use x64 or x32 server? have you used the usage based optimization wizard to add aggregations? do you have calculated members? calculated sets? We have a server with 16Gb, Windows x64, SQL 2005 x64, 2 * Dual Core Opteron for the moment only 2 years of history, in all my partitions in all cubes I have 250 millions of rows by year (biggest fact contain 50 millions / year, and the cube in front of this fact provide standard SUMS and DCount!) the response time is excellent, the first access is allways slow (cold cache), but I can drill from years to days in seconds, and add filters and the response time still good. I'm playing with 2 dcount measures + a ratio between these 2 measures + ratio between 2 sums. The memory is shared between SQL, AS and reporting services on this server, and AS don't use more then 2Gb today (but when we'll be in production with more historical data the memory usage will be higher) in your case, try to disable the hyperthreading, sometimes this feature produce bad results. "Todd" <toddkitta (AT) gmail (DOT) com> wrote in message news:1153342986.762069.317490 (AT) p79g2000cwp (DOT) googlegroups.com... Hello, I am having some pretty major performance issues with a cube I've built. My goal for this post is to give specifics related to the data and Analysis Services hardware to try and get some feedback from the community on whether or not I am having valid expectations performance-wise. Hardware: -Two dual-core Xeon 2.8GHz processors -3GB RAM Data: The data is retail transaction data which is at the line item level. The most recent years have 50-60 million rows and I am partitioning by year. In addition, each transaction has related discount and tender (payment) data. The tender data has approximately 15-20 million rows per year while the discount data has approximately 10-15 million rows per year. In my cube, each of these tables have their own separate fact group. I am pulling data from all three fact groups to create quite a few calculated members which build on each other to finally arrive at calculations such as GAAP Sales, sales for a particular product type, etc. I believe I have my hierarchies and relationships set up correctly in my dimensions. In addition I have aggregations designed on each partition at about the 40% level. Am I having reasonable expectations that data should be able to come back fairly quickly? I have some reports which take 10-15 minutes that bring back quite a lot of data for the whole company across the last two years. However, even some simple slicing and dicing in the cube browser can be slow. What is interesting to me is that even if I drop my time dimension on the rows axis before adding any measures or other dimensions, it can take 20 seconds or more to respond. All the while when reports are running or when I am browsing the cube myself, the CPU (all 8 in Task Manager [dual dual-core processors]) are spiking at 100%. While I believe I can get some good responses from MS folks and MVPs for my own benefit, I think any thoughtful posts will benefit the community as a whole by setting performance expectations and pointing out potential pitfalls. I have found that there are not a ton of good resources out there on performance, so I appreciate any tips anyone can offer. Thanks! Todd |
#8
| |||
| |||
|
|
BigJim, SP1 is applied on the server. In addition, my dimensional model is a star schema; I do not currently have any parent-child dimensions. BK, All of my partitions are MOLAP. Does the fact that even dropping a time hierarchy on the rows access of an empty browser causes the CPUs to spike for an extended period make sense to anyone? That is just strange to me and if that is a symptom of some other issue, maybe it's at the root of my problems. I appreciate every one's feedback and questions so far! BigJimSlade wrote: Not sure if this is helpful or not; but have you applied SP1 for AS? it has seemed to be faster for me I have also heard that parent child dimensions have a significant impact on perf and that the upcoming sp 2 should have some more performance fixes. Another thing that I have heard, but not tested is to have multiple cubes instead of one big cube with all of you measure groups. I think this technique is only benefical when you have a lot of non conformed dimensions or query that access measures from single measure groups "Jeje" wrote: I have read articles in the past year describe possible performance issues with hyperthreading with SQL Server. I don't know if this could impact AS. maybe what's appends: AS think there is 8 CPUs and optimize himself to use these 8 CPU this will cause bottleneck because there is too many threads and not enough real CPU. this cause a lot of waits. Hyperthreading is good when you run different applications which use different "part" of the CPU, I mean floating calculation and integer calculation at the same time. the performance advantage is really small for a single application. what is your frontend application? can you send your connectionstring? if you force the execution on the server side (execution location=3) sometimes this greatly decrease the performance!!! in my case, I have found that for dcount dynamic calculation the execution location on the server cause a slow performance, a query takes 2 minutes instead-of 20 seconds (AS2000 enterprise) but its not always true, and standard "sums" and "count" measures perform better on the server side. take a look at the query number in the performance monitor. when you execute a query and you see this number to raise quickly, its because you have recursive calculations or non empty crossjoin calculations which cause the server to take a long time to calculate the result. a standard access should execute between 1 and 4 queries (query number) using the pivottable. "Todd" <toddkitta (AT) gmail (DOT) com> wrote in message news:1153354436.542320.103860 (AT) m79g2000cwm (DOT) googlegroups.com... Jeje, First, thanks for your response. I always see your activity on the boards! I should have mentioned most of what you asked... I am using AS2005; x32; currently I am using Usage Based Optimizations, but I am not getting much better results than I had with the standard aggregation wizard. And I have a ton of calculated members, and a lot of them build on each other. I have not heard about disabling the hyperthreading. What are the issues here? Could it make a drastic difference? Any other ideas? Thanks again for your response. Jeje wrote: do you use AS2000 or 2005? do you use x64 or x32 server? have you used the usage based optimization wizard to add aggregations? do you have calculated members? calculated sets? We have a server with 16Gb, Windows x64, SQL 2005 x64, 2 * Dual Core Opteron for the moment only 2 years of history, in all my partitions in all cubes I have 250 millions of rows by year (biggest fact contain 50 millions / year, and the cube in front of this fact provide standard SUMS and DCount!) the response time is excellent, the first access is allways slow (cold cache), but I can drill from years to days in seconds, and add filters and the response time still good. I'm playing with 2 dcount measures + a ratio between these 2 measures + ratio between 2 sums. The memory is shared between SQL, AS and reporting services on this server, and AS don't use more then 2Gb today (but when we'll be in production with more historical data the memory usage will be higher) in your case, try to disable the hyperthreading, sometimes this feature produce bad results. "Todd" <toddkitta (AT) gmail (DOT) com> wrote in message news:1153342986.762069.317490 (AT) p79g2000cwp (DOT) googlegroups.com... Hello, I am having some pretty major performance issues with a cube I've built. My goal for this post is to give specifics related to the data and Analysis Services hardware to try and get some feedback from the community on whether or not I am having valid expectations performance-wise. Hardware: -Two dual-core Xeon 2.8GHz processors -3GB RAM Data: The data is retail transaction data which is at the line item level. The most recent years have 50-60 million rows and I am partitioning by year. In addition, each transaction has related discount and tender (payment) data. The tender data has approximately 15-20 million rows per year while the discount data has approximately 10-15 million rows per year. In my cube, each of these tables have their own separate fact group. I am pulling data from all three fact groups to create quite a few calculated members which build on each other to finally arrive at calculations such as GAAP Sales, sales for a particular product type, etc. I believe I have my hierarchies and relationships set up correctly in my dimensions. In addition I have aggregations designed on each partition at about the 40% level. Am I having reasonable expectations that data should be able to come back fairly quickly? I have some reports which take 10-15 minutes that bring back quite a lot of data for the whole company across the last two years. However, even some simple slicing and dicing in the cube browser can be slow. What is interesting to me is that even if I drop my time dimension on the rows axis before adding any measures or other dimensions, it can take 20 seconds or more to respond. All the while when reports are running or when I am browsing the cube myself, the CPU (all 8 in Task Manager [dual dual-core processors]) are spiking at 100%. While I believe I can get some good responses from MS folks and MVPs for my own benefit, I think any thoughtful posts will benefit the community as a whole by setting performance expectations and pointing out potential pitfalls. I have found that there are not a ton of good resources out there on performance, so I appreciate any tips anyone can offer. Thanks! Todd |
![]() |
| Thread Tools | |
| Display Modes | |
| |