#2
I have a cube with about 3.5 million rows in the fact table. I clone that partition and populate it from the same database but using only 10000 rows (using an appropriate filter clause). When I merge this new partition back into the original partition, I notice that the cubename.fact.data file increases from 5445 KB to 5520 KB. I thought that this was odd, since the cells being populated already had data in them. Thus, I wasn't expecting the file size to grow at all.

I tried again with various sizes for the second partition and got the following results:

10000 rows => 5520 KB
100000 rows => 5636 KB
500000 rows => 6363 KB
1000000 rows => 7459 KB

The fact that the cubename.fact.data file increased in size was slightly odd, but could be explained if the merged data were stored separately.

But here's the strange part. When I repeatedly merge a similar-sized partition, the file size does not continue to increase, except in the case of 1,000,000 rows. For example, I merged the same 500,000-row partition back into the original 100 times and the file size never went higher than 6363 KB. However, when the new partition contained 1,000,000 rows, the file size increased as follows:

7459 KB
8867 KB
10275 KB

Always the same increment of 1408 KB. The larger file sizes also mean that merges of a partition of that size take progressively longer.

Does anyone have any insight into this behavior?
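For anyone who wants to check the arithmetic, here is a small sketch that recomputes the growth figures from the sizes quoted above. The variable names are made up for illustration, and the "bytes per merged row" figure is only a rough ratio derived from these numbers, not something Analysis Services reports.

```python
# Sizes in KB, exactly as reported in the post above.
BASELINE_KB = 5445
after_first_merge = {10_000: 5520, 100_000: 5636, 500_000: 6363, 1_000_000: 7459}
repeated_1m_merges = [7459, 8867, 10275]

# Growth over the original file size for each merged partition size.
growth = {rows: size - BASELINE_KB for rows, size in after_first_merge.items()}

# Step between consecutive merges of the 1,000,000-row partition.
increments = [b - a for a, b in zip(repeated_1m_merges, repeated_1m_merges[1:])]

for rows, kb in growth.items():
    print(f"{rows:>9} rows: +{kb} KB ({kb * 1024 / rows:.2f} bytes per merged row)")
print("increments between repeated 1,000,000-row merges:", increments)
```

Note that the first 1,000,000-row merge adds 2014 KB, while each later one adds 1408 KB.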
#3
Internally, the .data files are broken into segments, and the merge algorithm merges segment by segment. Therefore, if matching rows happen to belong to different segments, the size increases.

--
==================================================
Mosha Pasumansky - http://www.mosha.com/msolap
Development Lead in the Analysis Server team
All you need is love (John Lennon)
Disclaimer : This posting is provided "AS IS" with no warranties, and confers no rights.
==================================================

"David Hwang" <davidhwang (AT) usa (DOT) com> wrote in message news:d9f37fcd.0402251449.67b9ab88 (AT) posting (DOT) google.com...
> I have a cube with about 3.5 million rows in the fact table. [...]
> Does anyone have any insight into this behavior?
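To make the segment-by-segment idea concrete, here is a toy model in Python. It is not the actual Analysis Services merge algorithm or file format. The segment layout (a dict per segment), the pairing of segments by position, and the names `merge_segmentwise` and `stored_rows` are all assumptions made for illustration. It only shows how rows can end up stored twice when their match sits in a different segment.

```python
from typing import Dict, List

# Hypothetical layout: one dict per segment, cell key -> aggregated measure.
Segment = Dict[str, float]


def merge_segmentwise(target: List[Segment], incoming: List[Segment]) -> List[Segment]:
    """Toy model: pair segments by position and combine rows only within a pair."""
    merged = [dict(seg) for seg in target]
    original_count = len(target)
    for i, seg in enumerate(incoming):
        if i >= original_count:
            # No segment to pair with: store the incoming segment as-is.
            merged.append(dict(seg))
            continue
        leftovers: Segment = {}
        for key, value in seg.items():
            if key in merged[i]:
                merged[i][key] += value  # match found in the paired segment: no growth
            else:
                leftovers[key] = value   # any match lives elsewhere: row is stored again
        if leftovers:
            merged.append(leftovers)
    return merged


def stored_rows(segments: List[Segment]) -> int:
    """Total number of rows physically stored across all segments."""
    return sum(len(seg) for seg in segments)


target = [{"a": 1.0, "b": 2.0}, {"c": 3.0, "d": 4.0}]
aligned = [{"a": 10.0}, {"d": 40.0}]      # matching rows sit in the paired segments
misaligned = [{"d": 40.0}, {"a": 10.0}]   # same rows, but in the "wrong" segments

print(stored_rows(merge_segmentwise(target, aligned)))     # 4: nothing grows
print(stored_rows(merge_segmentwise(target, misaligned)))  # 6: two rows are duplicated
```

In this model the same incoming rows produce either no growth or extra stored rows, depending only on which segment they land in.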
#4
Thanks for the answer. In the future, will it be possible to compact the .data files after large merges?