![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hello, I am very perplexed with this problem I'm having. I work for a website-based company and my job is to create a data warehouse based on usage of the site. We have a Central Logging Database that logs every single hit to the site, then i have a stored procedure that picks up these hits, does a bit of data scrubbing and transformation, and puts them into a Warehouse database. i then have an Analysis Services cube that points at this warehouse database. I then use Reporting Services and MDX to create reports based on this usage info. Scenario: the total hits in the database can be further grouped into companies, so reports can be run to show a company's usage of the site. Companies are then further divided into Users belonging to a company. I.e. reports are generally run to show a company's usage, broken down into individual users within that company. In order to check that my warehouse-load stored procedure works properly I compared the amount of hits in the Central Logging Database (where all hits are originally recorded) with hits in the final Warehouse database to see if they balance. Everything looks fine and hunky-dory. My problem is this: When I browse the cube in analysis manager, some of the companies have a higher hit count, i.e. show more hits, than actually exist in any of the original databases! In the cube one company is even showing 1,000 hits for a user who does not have ANY hits in any of the original databases!!! the total amount of hits shown in the cube is equal to the original databases, so some how it balances. but i don't understand where all this extra\incorrect data is coming from. My cube consists of a very simple Star-schema design. one main table called Hits, and UserAccount, BusinessEntity, and Date dimensions. the Hits table consists of the following columns: Hit_ID (int) DatabaseSource (int) UserAccountID (int) BusinessEntityID (int) LogDateID (int) One hit is defined by a Hit_ID and DatabaseSource (these two form a composite primary key). My Measure in the cube is the Hit_ID and I'm doing a Count on it. Does anyone know what i could possibly be doing wrong? I would be very grateful for any bit of help anyone could provide. Thanks in advance! Maria |
#3
| |||
| |||
|
|
I think the best way to troubleshoot it is to do DrillThorugh on the cell which shows wrong numbers - and track down the records in the fact table that contributed to it. -- ================================================== Mosha Pasumansky - http://www.mosha.com/msolap Analysis Services blog at http://www.sqljunkies.com/WebLog/mosha Development Lead in the Analysis Server team All you need is love (John Lennon) Disclaimer : This posting is provided "AS IS" with no warranties, and confers no rights. ================================================== "Maria" <Maria (AT) discussions (DOT) microsoft.com> wrote in message news:F2356D40-EA12-4D2F-B60E-F0243E977C22 (AT) microsoft (DOT) com... Hello, I am very perplexed with this problem I'm having. I work for a website-based company and my job is to create a data warehouse based on usage of the site. We have a Central Logging Database that logs every single hit to the site, then i have a stored procedure that picks up these hits, does a bit of data scrubbing and transformation, and puts them into a Warehouse database. i then have an Analysis Services cube that points at this warehouse database. I then use Reporting Services and MDX to create reports based on this usage info. Scenario: the total hits in the database can be further grouped into companies, so reports can be run to show a company's usage of the site. Companies are then further divided into Users belonging to a company. I.e. reports are generally run to show a company's usage, broken down into individual users within that company. In order to check that my warehouse-load stored procedure works properly I compared the amount of hits in the Central Logging Database (where all hits are originally recorded) with hits in the final Warehouse database to see if they balance. Everything looks fine and hunky-dory. My problem is this: When I browse the cube in analysis manager, some of the companies have a higher hit count, i.e. show more hits, than actually exist in any of the original databases! In the cube one company is even showing 1,000 hits for a user who does not have ANY hits in any of the original databases!!! the total amount of hits shown in the cube is equal to the original databases, so some how it balances. but i don't understand where all this extra\incorrect data is coming from. My cube consists of a very simple Star-schema design. one main table called Hits, and UserAccount, BusinessEntity, and Date dimensions. the Hits table consists of the following columns: Hit_ID (int) DatabaseSource (int) UserAccountID (int) BusinessEntityID (int) LogDateID (int) One hit is defined by a Hit_ID and DatabaseSource (these two form a composite primary key). My Measure in the cube is the Hit_ID and I'm doing a Count on it. Does anyone know what i could possibly be doing wrong? I would be very grateful for any bit of help anyone could provide. Thanks in advance! Maria |
![]() |
| Thread Tools | |
| Display Modes | |
| |