![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow upto 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks |
#3
| |||
| |||
|
|
Technically speaking, SSAS 64bit should be able to manage that dimension. May be you have to get a lot of RAM and waiting enough when incrementally process that dimension (unless you use the "Add members only" option that is faster). But, do you really have a dimension with this large number of items? A lot of time I encountered customers scenario where a proper design required less resources, satisfying all analysis requirements. What kind of dimension is so large? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow up to 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks |
#4
| |||
| |||
|
|
Marco Russo wrote: Technically speaking, SSAS 64bit should be able to manage that dimension. May be you have to get a lot of RAM and waiting enough when incrementally process that dimension (unless you use the "Add members only" option that is faster). But, do you really have a dimension with this large number of items? A lot of time I encountered customers scenario where a proper design required less resources, satisfying all analysis requirements. What kind of dimension is so large? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow up to 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks Hi Marco Thanks for the answer. Yes the dimension taking too much time for processing. Its dimension of IP address who visited our client sites. We store IP address of each user. Since we are getting very high number of views our IP dimension is very large and will grow also rapidly. Right now it is 50 million at the end of 4 months and at the end of year it will be 200 million. Is there other alternative for it? When I set the dimension to ROLAP , Why query is taking much time? Does cube make any aggregation or not when I set the dimension type to ROLAP? Thanks |
#5
| |||
| |||
|
|
Yes, you don't have many alternatives. The only thing I would consider is: what is the use you do of IP address? Do you really navigate into IP addresses or you use them only to count how many DISTINCT IP addresses have visited your site into a certain period of time? Unfortunately this is a hard problem to solve with large dimensions. AS2005 can have low performance on Distinct Count measures. A partial solution is the use of a different model I described in my paper (see http://www.sqlbi.eu/manytomany.aspx) that can help you with this large numbers (but do not try my model on a 32-bit server with your numbers, it would exhaust RAM during processing). If you don't have the need to get the distinct count of IP addresses that visited your site, then you could use some different approach. The ROLAP dimension is not a solution - aggregations are not so fast like MOLAP. But, the main design question is: how many other attributes do you have on the IP addresses dimension? What kind of queries do you want to do on this dimension? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Marco Russo wrote: Technically speaking, SSAS 64bit should be able to manage that dimension. May be you have to get a lot of RAM and waiting enough when incrementally process that dimension (unless you use the "Add members only" option that is faster). But, do you really have a dimension with this large number of items? A lot of time I encountered customers scenario where a proper design required less resources, satisfying all analysis requirements. What kind of dimension is so large? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow up to 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks Hi Marco Thanks for the answer. Yes the dimension taking too much time for processing. Its dimension of IP address who visited our client sites. We store IP address of each user. Since we are getting very high number of views our IP dimension is very large and will grow also rapidly. Right now it is 50 million at the end of 4 months and at the end of year it will be 200 million. Is there other alternative for it? When I set the dimension to ROLAP , Why query is taking much time? Does cube make any aggregation or not when I set the dimension type to ROLAP? Thanks |
#6
| |||
| |||
|
|
Hi Marco I need how many times in a day each IP visited the site. So I do not need Distinct Count. Only problem is processing time for the large dimension while storage type is MOLAP. IP dimension doesn't have any attributes. Our fact table has 800 million rows right now and it will be some decent billions at end of year. Is there any alternate for it? Thanks Marco Russo wrote: Yes, you don't have many alternatives. The only thing I would consider is: what is the use you do of IP address? Do you really navigate into IP addresses or you use them only to count how many DISTINCT IP addresses have visited your site into a certain period of time? Unfortunately this is a hard problem to solve with large dimensions. AS2005 can have low performance on Distinct Count measures. A partial solution is the use of a different model I described in my paper (see http://www.sqlbi.eu/manytomany.aspx) that can help you with this large numbers (but do not try my model on a 32-bit server with your numbers, it would exhaust RAM during processing). If you don't have the need to get the distinct count of IP addresses that visited your site, then you could use some different approach. The ROLAP dimension is not a solution - aggregations are not so fast like MOLAP. But, the main design question is: how many other attributes do you have on the IP addresses dimension? What kind of queries do you want to do on this dimension? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Marco Russo wrote: Technically speaking, SSAS 64bit should be able to manage that dimension. May be you have to get a lot of RAM and waiting enough when incrementally process that dimension (unless you use the "Add members only" option that is faster). But, do you really have a dimension with this large number of items? A lot of time I encountered customers scenario where a proper design required less resources, satisfying all analysis requirements. What kind of dimension is so large? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow up to 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks Hi Marco Thanks for the answer. Yes the dimension taking too much time for processing. Its dimension of IP address who visited our client sites. We store IP address of each user. Since we are getting very high number of views our IP dimension is very large and will grow also rapidly. Right now it is 50 million at the end of 4 months and at the end of year it will be 200 million. Is there other alternative for it? When I set the dimension to ROLAP , Why query is taking much time? Does cube make any aggregation or not when I set the dimension type to ROLAP? Thanks |
#7
| |||
| |||
|
|
Ok, I would try to use the MOLAP dimension. You have to get a very strict dimension member size - use a int32 as dimension key member type (any IP address is a 32bit representation) and eventually move to the client the conversion into a tring representation. Use this same key into fact table, so you don't have to do any lookup conversion. Use Process Add for the dimension process (you only have new members and does not have to modifiy other attributes, right?). If you use the regular Incremental update, you have the worst case from the performance point of view. Then use partitions on measure group to split data in several partitions. May be you want to do the same for fact table in relational database. I think this could help you very much to process data into MOLAP partitions, getting faster results than ROLAP. Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Hi Marco I need how many times in a day each IP visited the site. So I do not need Distinct Count. Only problem is processing time for the large dimension while storage type is MOLAP. IP dimension doesn't have any attributes. Our fact table has 800 million rows right now and it will be some decent billions at end of year. Is there any alternate for it? Thanks Marco Russo wrote: Yes, you don't have many alternatives. The only thing I would consider is: what is the use you do of IP address? Do you really navigate into IP addresses or you use them only to count how many DISTINCT IP addresses have visited your site into a certain period of time? Unfortunately this is a hard problem to solve with large dimensions. AS2005 can have low performance on Distinct Count measures. A partial solution is the use of a different model I described in my paper (see http://www.sqlbi.eu/manytomany.aspx) that can help you with this large numbers (but do not try my model on a 32-bit server with your numbers, it would exhaust RAM during processing). If you don't have the need to get the distinct count of IP addresses that visited your site, then you could use some different approach. The ROLAP dimension is not a solution - aggregations are not so fast like MOLAP. But, the main design question is: how many other attributes do you have on the IP addresses dimension? What kind of queries do you want to do on this dimension? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Marco Russo wrote: Technically speaking, SSAS 64bit should be able to manage that dimension. May be you have to get a lot of RAM and waiting enough when incrementally process that dimension (unless you use the "Add members only" option that is faster). But, do you really have a dimension with this large number of items? A lot of time I encountered customers scenario where a proper design required less resources, satisfying all analysis requirements. What kind of dimension is so large? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow up to 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks Hi Marco Thanks for the answer. Yes the dimension taking too much time for processing. Its dimension of IP address who visited our client sites. We store IP address of each user. Since we are getting very high number of views our IP dimension is very large and will grow also rapidly. Right now it is 50 million at the end of 4 months and at the end of year it will be 200 million. Is there other alternative for it? When I set the dimension to ROLAP , Why query is taking much time? Does cube make any aggregation or not when I set the dimension type to ROLAP? Thanks |
#8
| |||
| |||
|
|
Thanks Marco, I am using Process Update and it takes too much time.While processing cube I can not find Process Add option. How to do Process Add instead of Process Update? I am using SQL Server 2005 ,x64 . Regards Marco Russo wrote: Ok, I would try to use the MOLAP dimension. You have to get a very strict dimension member size - use a int32 as dimension key member type (any IP address is a 32bit representation) and eventually move to the client the conversion into a tring representation. Use this same key into fact table, so you don't have to do any lookup conversion. Use Process Add for the dimension process (you only have new members and does not have to modifiy other attributes, right?). If you use the regular Incremental update, you have the worst case from the performance point of view. Then use partitions on measure group to split data in several partitions. May be you want to do the same for fact table in relational database. I think this could help you very much to process data into MOLAP partitions, getting faster results than ROLAP. Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Hi Marco I need how many times in a day each IP visited the site. So I do not need Distinct Count. Only problem is processing time for the large dimension while storage type is MOLAP. IP dimension doesn't have any attributes. Our fact table has 800 million rows right now and it will be some decent billions at end of year. Is there any alternate for it? Thanks Marco Russo wrote: Yes, you don't have many alternatives. The only thing I would consider is: what is the use you do of IP address? Do you really navigate into IP addresses or you use them only to count how many DISTINCT IP addresses have visited your site into a certain period of time? Unfortunately this is a hard problem to solve with large dimensions. AS2005 can have low performance on Distinct Count measures. A partial solution is the use of a different model I described in my paper (see http://www.sqlbi.eu/manytomany.aspx) that can help you with this large numbers (but do not try my model on a 32-bit server with your numbers, it would exhaust RAM during processing). If you don't have the need to get the distinct count of IP addresses that visited your site, then you could use some different approach. The ROLAP dimension is not a solution - aggregations are not so fast like MOLAP. But, the main design question is: how many other attributes do you have on the IP addresses dimension? What kind of queries do you want to do on this dimension? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Marco Russo wrote: Technically speaking, SSAS 64bit should be able to manage that dimension. May be you have to get a lot of RAM and waiting enough when incrementally process that dimension (unless you use the "Add members only" option that is faster). But, do you really have a dimension with this large number of items? A lot of time I encountered customers scenario where a proper design required less resources, satisfying all analysis requirements. What kind of dimension is so large? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow up to 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks Hi Marco Thanks for the answer. Yes the dimension taking too much time for processing. Its dimension of IP address who visited our client sites. We store IP address of each user. Since we are getting very high number of views our IP dimension is very large and will grow also rapidly. Right now it is 50 million at the end of 4 months and at the end of year it will be 200 million. Is there other alternative for it? When I set the dimension to ROLAP , Why query is taking much time? Does cube make any aggregation or not when I set the dimension type to ROLAP? Thanks |
#9
| |||
| |||
|
|
It is not available in the wizard. You have to schedule the process using XML/A. See http://msdn2.microsoft.com/en-us/lib..._asproc_topic3 and http://msdn2.microsoft.com/en-us/lib..._asproc_topic7 Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Thanks Marco, I am using Process Update and it takes too much time.While processing cube I can not find Process Add option. How to do Process Add instead of Process Update? I am using SQL Server 2005 ,x64 . Regards Marco Russo wrote: Ok, I would try to use the MOLAP dimension. You have to get a very strict dimension member size - use a int32 as dimension key member type (any IP address is a 32bit representation) and eventually move to the client the conversion into a tring representation. Use this same key into fact table, so you don't have to do any lookup conversion. Use Process Add for the dimension process (you only have new members and does not have to modifiy other attributes, right?). If you use the regular Incremental update, you have the worst case from the performance point of view. Then use partitions on measure group to split data in several partitions. May be you want to do the same for fact table in relational database. I think this could help you very much to process data into MOLAP partitions, getting faster results than ROLAP. Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Hi Marco I need how many times in a day each IP visited the site. So I do not need Distinct Count. Only problem is processing time for the large dimension while storage type is MOLAP. IP dimension doesn't have any attributes. Our fact table has 800 million rows right now and it will be some decent billions at end of year. Is there any alternate for it? Thanks Marco Russo wrote: Yes, you don't have many alternatives. The only thing I would consider is: what is the use you do of IP address? Do you really navigate into IP addresses or you use them only to count how many DISTINCT IP addresses have visited your site into a certain period of time? Unfortunately this is a hard problem to solve with large dimensions. AS2005 can have low performance on Distinct Count measures. A partial solution is the use of a different model I described in my paper (see http://www.sqlbi.eu/manytomany.aspx) that can help you with this large numbers (but do not try my model on a 32-bit server with your numbers, it would exhaust RAM during processing). If you don't have the need to get the distinct count of IP addresses that visited your site, then you could use some different approach. The ROLAP dimension is not a solution - aggregations are not so fast like MOLAP. But, the main design question is: how many other attributes do you have on the IP addresses dimension? What kind of queries do you want to do on this dimension? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Marco Russo wrote: Technically speaking, SSAS 64bit should be able to manage that dimension. May be you have to get a lot of RAM and waiting enough when incrementally process that dimension (unless you use the "Add members only" option that is faster). But, do you really have a dimension with this large number of items? A lot of time I encountered customers scenario where a proper design required less resources, satisfying all analysis requirements. What kind of dimension is so large? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow up to 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks Hi Marco Thanks for the answer. Yes the dimension taking too much time for processing. Its dimension of IP address who visited our client sites. We store IP address of each user. Since we are getting very high number of views our IP dimension is very large and will grow also rapidly. Right now it is 50 million at the end of 4 months and at the end of year it will be 200 million. Is there other alternative for it? When I set the dimension to ROLAP , Why query is taking much time? Does cube make any aggregation or not when I set the dimension type to ROLAP? Thanks |
#10
| |||
| |||
|
|
Thanks Marco I ran the xmla script Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" Object DatabaseID>Adventure Works DW</DatabaseID DimensionID>Dim Customer</DimensionID /Object Type>ProcessAdd</Type DataSourceView ID>Adventure Works DW</ID Name>Adventure Works DW</Name DataSourceID>Adventure Works DW</DataSourceID Schema ... xs:element name="dbo_DimCustomer" msprop:QueryDefinition="SELECT * FROM DimCustomer WHERE CustomerKey > 29483" msprop bTableName="DimCustomer"msprop:IsLogical="True" msprop:TableType="View" ... /Schema /DataSourceView /Process But getting error Executing the query ... XML parsing failed at line 23, column 49: Undeclared prefix. . Execution complete Can you post some query which will work on Adeventure Works Cube? Regards Marco Russo wrote: It is not available in the wizard. You have to schedule the process using XML/A. See http://msdn2.microsoft.com/en-us/lib..._asproc_topic3 and http://msdn2.microsoft.com/en-us/lib..._asproc_topic7 Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Thanks Marco, I am using Process Update and it takes too much time.While processing cube I can not find Process Add option. How to do Process Add instead of Process Update? I am using SQL Server 2005 ,x64 . Regards Marco Russo wrote: Ok, I would try to use the MOLAP dimension. You have to get a very strict dimension member size - use a int32 as dimension key member type (any IP address is a 32bit representation) and eventually move to the client the conversion into a tring representation. Use this same key into fact table, so you don't have to do any lookup conversion. Use Process Add for the dimension process (you only have new members and does not have to modifiy other attributes, right?). If you use the regular Incremental update, you have the worst case from the performance point of view. Then use partitions on measure group to split data in several partitions. May be you want to do the same for fact table in relational database. I think this could help you very much to process data into MOLAP partitions, getting faster results than ROLAP. Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Hi Marco I need how many times in a day each IP visited the site. So I do not need Distinct Count. Only problem is processing time for the large dimension while storage type is MOLAP. IP dimension doesn't have any attributes. Our fact table has 800 million rows right now and it will be some decent billions at end of year. Is there any alternate for it? Thanks Marco Russo wrote: Yes, you don't have many alternatives. The only thing I would consider is: what is the use you do of IP address? Do you really navigate into IP addresses or you use them only to count how many DISTINCT IP addresses have visited your site into a certain period of time? Unfortunately this is a hard problem to solve with large dimensions. AS2005 can have low performance on Distinct Count measures. A partial solution is the use of a different model I described in my paper (see http://www.sqlbi.eu/manytomany.aspx) that can help you with this large numbers (but do not try my model on a 32-bit server with your numbers, it would exhaust RAM during processing). If you don't have the need to get the distinct count of IP addresses that visited your site, then you could use some different approach. The ROLAP dimension is not a solution - aggregations are not so fast like MOLAP. But, the main design question is: how many other attributes do you have on the IP addresses dimension? What kind of queries do you want to do on this dimension? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com ha scritto: Marco Russo wrote: Technically speaking, SSAS 64bit should be able to manage that dimension. May be you have to get a lot of RAM and waiting enough when incrementally process that dimension (unless you use the "Add members only" option that is faster). But, do you really have a dimension with this large number of items? A lot of time I encountered customers scenario where a proper design required less resources, satisfying all analysis requirements. What kind of dimension is so large? Marco Russo http://www.sqlbi.eu http://www.sqljunkies.com/weblog/sqlbi sqlprobs (AT) gmail (DOT) com wrote: Hi all I am using Analysis Service 2005 64 bit. My dimension has 50 million rows now and will grow up to 300 millions or more in future. I set the storage type as ROLAP for dimension but the query runs very slow. So I change it to MOLAP and query runs fine I want to know can MLOAP handle such big number of rows in dimensions? What is maximum capacity of MOLAP Dimension to hold number of rows? Is there some special method to handle large dimension? Thanks Hi Marco Thanks for the answer. Yes the dimension taking too much time for processing. Its dimension of IP address who visited our client sites. We store IP address of each user. Since we are getting very high number of views our IP dimension is very large and will grow also rapidly. Right now it is 50 million at the end of 4 months and at the end of year it will be 200 million. Is there other alternative for it? When I set the dimension to ROLAP , Why query is taking much time? Does cube make any aggregation or not when I set the dimension type to ROLAP? Thanks |
![]() |
| Thread Tools | |
| Display Modes | |
| |