data cleansing: externally or internally? - 11-04-2011, 12:51 AM
There is a big text file with dirty data, and a company wants it
cleaned. There are some known patterns, expressed as LIKE or regexp
expressions. I first thought about two approaches:
1) do the cleaning on the system level (outside the database)
2) do the cleaning in a database
For the latter case, it looks to me like I could either use external
tables, or load the data into a temporary table and then do the cleaning.
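
To make option 1 concrete, here is a minimal sketch of what cleaning on
the system level could look like: a script that streams the file line by
line and applies regexp rules, so the whole file never has to fit in
memory. The rules in RULES and the names clean_line, clean_lines and
clean_file are made up for illustration; they are not the real patterns.

```python
import re
from typing import Iterable, Iterator

# Hypothetical cleaning rules as (compiled pattern, replacement) pairs.
# These are placeholders, not the actual known patterns.
RULES = [
    (re.compile(r"[ \t]+"), " "),                  # collapse runs of spaces/tabs
    (re.compile(r"\bn/a\b", re.IGNORECASE), ""),   # drop a known junk token
]


def clean_line(line: str) -> str:
    """Apply every rule in order, then trim surrounding whitespace."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line.strip()


def clean_lines(lines: Iterable[str]) -> Iterator[str]:
    """Lazily clean lines one at a time, skipping lines that end up empty."""
    for line in lines:
        cleaned = clean_line(line.rstrip("\n"))
        if cleaned:
            yield cleaned


def clean_file(src_path: str, dst_path: str) -> None:
    """Stream src_path to dst_path without loading the whole file."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for cleaned in clean_lines(src):
            dst.write(cleaned + "\n")
```

This needs no extra space in the database, but every rule has to be
written and tested outside of SQL.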
I am looking for the pros and cons of each variant. My intuition tells me
that loading into a temporary table would give the most flexibility but
would also take additional space. I am not sure about the other methods.
I would appreciate your opinion about what I should pay attention to when
choosing among them. How are they restricted in terms of performance,
flexibility and capabilities (e.g. multitable loading)? I am also
interested in good practices and any experience from similar cases that
you can share.
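
For comparison, a rough sketch of the temporary-table variant: load the
raw lines into a staging table, then route them into a clean table and a
rejects table using LIKE and regexp conditions. It uses Python's sqlite3
module with an in-memory database purely as a stand-in so the example is
self-contained; it is not Oracle syntax, and the table names, the
load_and_clean helper and both patterns are assumptions for illustration.

```python
import re
import sqlite3


def regexp_match(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X), so the pattern comes first.
    return int(value is not None and re.search(pattern, value) is not None)


def load_and_clean(raw_lines):
    """Stage raw lines, then split them into clean rows and rejects."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("REGEXP", 2, regexp_match)
    conn.executescript(
        """
        CREATE TABLE staging (line_no INTEGER PRIMARY KEY, raw TEXT);
        CREATE TABLE clean_rows (line_no INTEGER, value TEXT);
        CREATE TABLE rejects (line_no INTEGER, raw TEXT);
        """
    )
    # Step 1: load everything as-is into the staging table.
    conn.executemany(
        "INSERT INTO staging (line_no, raw) VALUES (?, ?)",
        enumerate(raw_lines, start=1),
    )
    # Step 2: rows matching a LIKE pattern, or failing a regexp, are rejected.
    conn.execute(
        """
        INSERT INTO rejects
        SELECT line_no, raw FROM staging
        WHERE raw LIKE '#%' OR NOT (TRIM(raw) REGEXP '^[A-Za-z0-9 ]+$')
        """
    )
    # Step 3: everything else is trimmed and kept.
    conn.execute(
        """
        INSERT INTO clean_rows
        SELECT line_no, TRIM(raw) FROM staging
        WHERE line_no NOT IN (SELECT line_no FROM rejects)
        """
    )
    clean = conn.execute(
        "SELECT line_no, value FROM clean_rows ORDER BY line_no"
    ).fetchall()
    rejects = conn.execute(
        "SELECT line_no, raw FROM rejects ORDER BY line_no"
    ).fetchall()
    conn.close()
    return clean, rejects
```

The staging table is where the additional space goes, but it also keeps
the rejected rows around so they can be inspected after the load.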
Thank you,
geos
--
NOTE: Follow Up set to comp.databases.oracle.misc