Hello,

Can someone please help me understand how data profiling helps? Okay, so it helps the ETL process. But how? I have been googling data profiling, but everything I find is vague. Take, for instance, http://it-director.com/article.php?articleid=3492. The article is about "the importance of understanding your data," but I don't get it: it gives no specific example of data profiling.

OK. The author mentions three problems that data profiling tries to solve.

1. A lot of data is stored in legacy systems and is poorly documented. SO HOW DOES DATA PROFILING HELP? DO YOU ANALYZE A TABLE JUST TO UNDERSTAND THE DATA TYPES OF EACH COLUMN? IS THAT WHAT DATA PROFILING IS ALL ABOUT?

2. The data does not match the metadata that describes it. OKAY. THE LEGACY SYSTEM I AM WORKING WITH IS AN ACCESS DATABASE. HOW CAN A COLUMN OF TYPE NUMERIC EVER STORE ALPHANUMERIC CHARACTERS? NO DATABASE SYSTEM, LEGACY OR NOT, WOULD SIMPLY LET YOU VIOLATE THE DATA TYPE CONSTRAINTS. WHAT AM I MISSING?

3. The data itself contains errors and is inconsistent. THIS MUST BE HANDLED BY DATA "CLEANSING" AND NOT DATA "PROFILING."

I am sorry for using upper case, but I wanted to draw your attention. My manager wants me to learn all about data profiling, but I just don't get it. I would appreciate it if someone could give me some concrete examples of data profiling (and how they benefit the ETL process) so that we can choose the right ETL tool.

Thank you in advance for your help.

Vrushali
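For what it's worth, here is a rough sketch of what "profiling a column" can look like in practice. It is not taken from the linked article or from any particular ETL tool; the function names and the sample `customer_age` values are made up for illustration. It assumes the situation behind point 2: a column that the documentation calls numeric but that is physically stored as text, so the database never had a type constraint to enforce.

```python
from collections import Counter


def value_pattern(value):
    """Map digits to 9 and letters to A so values with the same shape group together."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def is_number(value):
    """Return True if the text can be parsed as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def profile_column(values):
    """Summarize one column of raw text values (None or blank counts as missing)."""
    present = [v.strip() for v in values if v is not None and v.strip() != ""]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "non_numeric": [v for v in present if not is_number(v)],
        "patterns": Counter(value_pattern(v) for v in present).most_common(),
    }


# Hypothetical sample: a text column that the documentation describes as numeric.
customer_age = ["34", "41", " 29", None, "", "N/A", "41", "unknown", "7O"]
report = profile_column(customer_age)
for key, value in report.items():
    print(key, value)
```

A report like this only describes the data: how many values are missing, how many are distinct, which ones do not parse as numbers, and which shapes occur. Deciding what to do with "N/A" or "7O" would be the cleansing step, so in this sketch profiling comes first and tells you what cleansing rules you need.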