![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi, We are a performance issue with a cube in AS2005 with the following characteristics: . Measure: o 1 DistinctCount - MeasureA. o 1 Sum - MeasureB. o 1 Calculated - MeasureC = MeasureB/MeasureA. Empty behavior is activated for both base measures. . Data can be divided in 2 clusters: o For data type X we load ~3 million rows per month (partition) to AS2005. o For data type Y we load ~25 million rows per month (partition) to AS2005. . Dimensions: 11 dimensions with 22 hierarchies. . One dimension has a default value to restrict data to type Y. . We have 24 partitions - 12 monthly partitions for each data type. All partitions have defined slices for both month and data type. The performance is very poor and we are not able to reduce the data that feeds AS2005. How can we improve the performance? We tried creating aggregations but the partitions' directory increase "exponentially". Wasn't supposed for AS2005 to improve the performance for DistinctCount measures? Any help will be appreciated. Thanks in advance. |
#3
| |||
| |||
|
|
can you quickly descibe your hardware and the time taken to answer your slow queries? what is the % of aggregations? does the dcount and the Sum measures are into 2 separate measure group? does the problem still here with a warm cache? (reecuting the same queries take the same time or the performance is better?) "Tiago Rente" <TiagoRente (AT) discussions (DOT) microsoft.com> wrote in message news:F2BEB525-2FCA-4F49-A135-4B17E503D97F (AT) microsoft (DOT) com... Hi, We are a performance issue with a cube in AS2005 with the following characteristics: . Measure: o 1 DistinctCount - MeasureA. o 1 Sum - MeasureB. o 1 Calculated - MeasureC = MeasureB/MeasureA. Empty behavior is activated for both base measures. . Data can be divided in 2 clusters: o For data type X we load ~3 million rows per month (partition) to AS2005. o For data type Y we load ~25 million rows per month (partition) to AS2005. . Dimensions: 11 dimensions with 22 hierarchies. . One dimension has a default value to restrict data to type Y. . We have 24 partitions - 12 monthly partitions for each data type. All partitions have defined slices for both month and data type. The performance is very poor and we are not able to reduce the data that feeds AS2005. How can we improve the performance? We tried creating aggregations but the partitions' directory increase "exponentially". Wasn't supposed for AS2005 to improve the performance for DistinctCount measures? Any help will be appreciated. Thanks in advance. |
#4
| |||
| |||
|
|
Hi, We have two environments: 1. 32-bit HP server with 8 x 2.5 GHz Xeon - 12 GB RAM - running W2K3 SP1 w/ parameter /3Gb. 2. 64-bit HP server with 4 x 1.5 GHz Itanium 2 - 16 GB RAM - running W2K3 SP1. Queries take normally more than 10 minutes do execute. Currently we do not have set the aggregations since we are not able to do the usage based optimization. I.e., Althought, we have queries in the QueryLog table the aggreagations created in the usage based optimization are not related to just the queries done by the users! In a previous version of the cube creating ~30% aggreagations, each partition had a file with almost 2 GB in size. This is not practical since we process the cubes in a staging server and then copied to the end-users server. The 2 measures are in the same measure group since the 3rd measure uses the other 2. I forgot to set the default measure as the MeasureB (sum). Would it increase the performance? Even if I execute the some report repeatly the second/third time takes more or less the same time. However, there were some reports that were execute is seconds, but this was a never refreshed query. Most of the tests were done in the 32-bit server. Thanks. "Jj" wrote: can you quickly descibe your hardware and the time taken to answer your slow queries? what is the % of aggregations? does the dcount and the Sum measures are into 2 separate measure group? does the problem still here with a warm cache? (reecuting the same queries take the same time or the performance is better?) "Tiago Rente" <TiagoRente (AT) discussions (DOT) microsoft.com> wrote in message news:F2BEB525-2FCA-4F49-A135-4B17E503D97F (AT) microsoft (DOT) com... Hi, We are a performance issue with a cube in AS2005 with the following characteristics: . Measure: o 1 DistinctCount - MeasureA. o 1 Sum - MeasureB. o 1 Calculated - MeasureC = MeasureB/MeasureA. Empty behavior is activated for both base measures. . Data can be divided in 2 clusters: o For data type X we load ~3 million rows per month (partition) to AS2005. o For data type Y we load ~25 million rows per month (partition) to AS2005. . Dimensions: 11 dimensions with 22 hierarchies. . One dimension has a default value to restrict data to type Y. . We have 24 partitions - 12 monthly partitions for each data type. All partitions have defined slices for both month and data type. The performance is very poor and we are not able to reduce the data that feeds AS2005. How can we improve the performance? We tried creating aggregations but the partitions' directory increase "exponentially". Wasn't supposed for AS2005 to improve the performance for DistinctCount measures? Any help will be appreciated. Thanks in advance. |
#5
| |||
| |||
|
|
the third measure? the calculated measure is in a separate measuregroup??? its really important to create a dedicated measuregroup for your DCount measure. (with only this measure in this group) how the non empty behavior is setup for the calculated measure? also, create starting aggregations. this size on the disk is required, no aggregation, no performance. there is no miracle. I have a cube with 2 dcounts (1 against 80 millions of rows / partition; second against 10 millions / partition; + standard sums and count measure) the response time is 10 seconds on a 2 * dual core server and with a warm cache, the same result takes only 1 or 2 seconds. my formula divide the 2 dcount measures. (and I have other measures like sum / dcount) "Tiago Rente" <TiagoRente (AT) discussions (DOT) microsoft.com> wrote in message news:260C7F37-6F7B-44C8-8B59-6A4D8DD12D2B (AT) microsoft (DOT) com... Hi, We have two environments: 1. 32-bit HP server with 8 x 2.5 GHz Xeon - 12 GB RAM - running W2K3 SP1 w/ parameter /3Gb. 2. 64-bit HP server with 4 x 1.5 GHz Itanium 2 - 16 GB RAM - running W2K3 SP1. Queries take normally more than 10 minutes do execute. Currently we do not have set the aggregations since we are not able to do the usage based optimization. I.e., Althought, we have queries in the QueryLog table the aggreagations created in the usage based optimization are not related to just the queries done by the users! In a previous version of the cube creating ~30% aggreagations, each partition had a file with almost 2 GB in size. This is not practical since we process the cubes in a staging server and then copied to the end-users server. The 2 measures are in the same measure group since the 3rd measure uses the other 2. I forgot to set the default measure as the MeasureB (sum). Would it increase the performance? Even if I execute the some report repeatly the second/third time takes more or less the same time. However, there were some reports that were execute is seconds, but this was a never refreshed query. Most of the tests were done in the 32-bit server. Thanks. "Jéjé" wrote: can you quickly descibe your hardware and the time taken to answer your slow queries? what is the % of aggregations? does the dcount and the Sum measures are into 2 separate measure group? does the problem still here with a warm cache? (reecuting the same queries take the same time or the performance is better?) "Tiago Rente" <TiagoRente (AT) discussions (DOT) microsoft.com> wrote in message news:F2BEB525-2FCA-4F49-A135-4B17E503D97F (AT) microsoft (DOT) com... Hi, We are a performance issue with a cube in AS2005 with the following characteristics: . Measure: o 1 DistinctCount - MeasureA. o 1 Sum - MeasureB. o 1 Calculated - MeasureC = MeasureB/MeasureA. Empty behavior is activated for both base measures. . Data can be divided in 2 clusters: o For data type X we load ~3 million rows per month (partition) to AS2005. o For data type Y we load ~25 million rows per month (partition) to AS2005. . Dimensions: 11 dimensions with 22 hierarchies. . One dimension has a default value to restrict data to type Y. . We have 24 partitions - 12 monthly partitions for each data type. All partitions have defined slices for both month and data type. The performance is very poor and we are not able to reduce the data that feeds AS2005. How can we improve the performance? We tried creating aggregations but the partitions' directory increase "exponentially". Wasn't supposed for AS2005 to improve the performance for DistinctCount measures? Any help will be appreciated. Thanks in advance. |
#6
| |||
| |||
|
|
All measures are in the same measure group. If I understood correctly you suggestion is: * Create a measure group (MeasureGroupX) just for the DistinctCount MeasureA. * Create a measure group (MeasureGroupY) containing the other to measures MeasureB + MeasureC (= MeasureB/MeasureA). * Create aggregations. Do I need to create aggregations on the MeasureGroupX? If so how do I ensure that only the aggregations for which I a query from the users is created? I tried creating aggregations using the usage-based optimization (UBO) it created more aggregations then queries. I had 2 distinct queries and it created more than 100 aggregations when I stopped the UBO. In AS2000 the UBO does not create more aggregations besides the ones in the QueryLog. Creating the two measure groups doesn't it implies loading the million of rows twice?? Did you change anything in the AS2005 configuration? E.g., OLAP\ProcessPlan\DistinctBuffer. I will try to create the two measure groups. Thanks. "Jj" wrote: the third measure? the calculated measure is in a separate measuregroup??? its really important to create a dedicated measuregroup for your DCount measure. (with only this measure in this group) how the non empty behavior is setup for the calculated measure? also, create starting aggregations. this size on the disk is required, no aggregation, no performance. there is no miracle. I have a cube with 2 dcounts (1 against 80 millions of rows / partition; second against 10 millions / partition; + standard sums and count measure) the response time is 10 seconds on a 2 * dual core server and with a warm cache, the same result takes only 1 or 2 seconds. my formula divide the 2 dcount measures. (and I have other measures like sum / dcount) "Tiago Rente" <TiagoRente (AT) discussions (DOT) microsoft.com> wrote in message news:260C7F37-6F7B-44C8-8B59-6A4D8DD12D2B (AT) microsoft (DOT) com... Hi, We have two environments: 1. 32-bit HP server with 8 x 2.5 GHz Xeon - 12 GB RAM - running W2K3 SP1 w/ parameter /3Gb. 2. 64-bit HP server with 4 x 1.5 GHz Itanium 2 - 16 GB RAM - running W2K3 SP1. Queries take normally more than 10 minutes do execute. Currently we do not have set the aggregations since we are not able to do the usage based optimization. I.e., Althought, we have queries in the QueryLog table the aggreagations created in the usage based optimization are not related to just the queries done by the users! In a previous version of the cube creating ~30% aggreagations, each partition had a file with almost 2 GB in size. This is not practical since we process the cubes in a staging server and then copied to the end-users server. The 2 measures are in the same measure group since the 3rd measure uses the other 2. I forgot to set the default measure as the MeasureB (sum). Would it increase the performance? Even if I execute the some report repeatly the second/third time takes more or less the same time. However, there were some reports that were execute is seconds, but this was a never refreshed query. Most of the tests were done in the 32-bit server. Thanks. "Jj" wrote: can you quickly descibe your hardware and the time taken to answer your slow queries? what is the % of aggregations? does the dcount and the Sum measures are into 2 separate measure group? does the problem still here with a warm cache? (reecuting the same queries take the same time or the performance is better?) "Tiago Rente" <TiagoRente (AT) discussions (DOT) microsoft.com> wrote in message news:F2BEB525-2FCA-4F49-A135-4B17E503D97F (AT) microsoft (DOT) com... Hi, We are a performance issue with a cube in AS2005 with the following characteristics: . Measure: o 1 DistinctCount - MeasureA. o 1 Sum - MeasureB. o 1 Calculated - MeasureC = MeasureB/MeasureA. Empty behavior is activated for both base measures. . Data can be divided in 2 clusters: o For data type X we load ~3 million rows per month (partition) to AS2005. o For data type Y we load ~25 million rows per month (partition) to AS2005. . Dimensions: 11 dimensions with 22 hierarchies. . One dimension has a default value to restrict data to type Y. . We have 24 partitions - 12 monthly partitions for each data type. All partitions have defined slices for both month and data type. The performance is very poor and we are not able to reduce the data that feeds AS2005. How can we improve the performance? We tried creating aggregations but the partitions' directory increase "exponentially". Wasn't supposed for AS2005 to improve the performance for DistinctCount measures? Any help will be appreciated. Thanks in advance. |
![]() |
| Thread Tools | |
| Display Modes | |
| |