#3
There is a big text file with dirty data, and a company wants it cleaned. There are some known patterns, expressed as LIKE or regexp. I first thought about two approaches:

1) do this on the system level
2) or in a database

For the latter case, it looks to me that I could use external tables, or load the data into a temporary table and then do the cleaning. I am looking for the pros and cons of each variant. My intuition tells me that loading into a temporary table would give the most flexibility but would also take additional space. I am not sure about the other methods.

I would appreciate your opinion on what I should pay attention to when choosing among the other methods. How are they restricted in terms of performance, flexibility and capabilities (e.g. multitable loading)? I am also interested in good practices and any experience from similar cases that you can share.

thank you,
geos

--
NOTE: Follow Up set to comp.databases.oracle.misc
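For approach 1 (cleaning on the system level), a minimal sketch of what that could look like is below. It streams the file line by line, so memory use stays flat no matter how big the file is, and applies a list of regexp rules to each line. The rules themselves, the `;` field separator, and the names `RULES`, `clean_line` and `clean_stream` are all assumptions made for illustration; the real patterns would come from the known LIKE/regexp rules mentioned above.

```python
import io
import re

# Hypothetical cleaning rules as (compiled pattern, replacement) pairs.
# These are placeholders, not the actual patterns from the question.
RULES = [
    (re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]"), ""),  # drop control characters (keeps tab)
    (re.compile(r"[ \t]+"), " "),                        # collapse runs of spaces and tabs
    (re.compile(r"\s*;\s*"), ";"),                       # trim around an assumed ';' separator
]


def clean_line(line):
    """Apply every rule in order to one line and return the cleaned text."""
    line = line.rstrip("\r\n")
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line.strip()


def clean_stream(src, dst):
    """Copy src to dst one line at a time, cleaning as we go.

    Lines that are empty after cleaning are dropped.
    Returns a (kept, dropped) pair of line counts.
    """
    kept = dropped = 0
    for raw in src:
        cleaned = clean_line(raw)
        if cleaned:
            dst.write(cleaned + "\n")
            kept += 1
        else:
            dropped += 1
    return kept, dropped


if __name__ == "__main__":
    # In-memory stand-in for the big file; real use would pass open() handles.
    sample = io.StringIO("  42 ;  Alice\t\tSmith \r\n\n7;Bob\x07\n")
    out = io.StringIO()
    print(clean_stream(sample, out))
    print(out.getvalue(), end="")
```

The cleaned output could then be loaded with whatever bulk loader is already in use. The trade-off against the staging-table variant is roughly the one described above: no extra space inside the database, but also no SQL available for cross-row checks such as duplicate detection.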
#10
On Nov 4, 12:51 am, geos<g... (AT) nowhere (DOT) invalid> wrote:
> There is a big text file with dirty data, and a company wants it cleaned. There are some known patterns, expressed as LIKE or regexp. I first thought about two approaches:
>
> 1) do this on the system level
> 2) or in a database
>
> For the latter case, it looks to me that I could use external tables, or load the data into a temporary table and then do the cleaning. I am looking for the pros and cons of each variant. My intuition tells me that loading into a temporary table would give the most flexibility but would also take additional space. I am not sure about the other methods.
>
> I would appreciate your opinion on what I should pay attention to when choosing among the other methods. How are they restricted in terms of performance, flexibility and capabilities (e.g. multitable loading)? I am also interested in good practices and any experience from similar cases that you can share.

After more than a decade of experience, my advice to you is: use Oracle as little as possible. I wrote all my business logic in C/C++, making calls to the database only as needed, and now my applications run much, much, much, much faster. Not to mention the improved development and debugging (I can use a debugger; I am not sure whether Oracle has something similar). In essence, the only commands that I run in the database are basic ones such as SELECT and UPDATE. No IFs or BUTs.
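As a rough illustration of the pattern described in this reply (cleaning logic in application code, with the database seeing only plain SELECT and UPDATE), here is a small sketch. It is written in Python rather than C/C++ to keep it short, and it uses the standard-library `sqlite3` module purely as a stand-in for Oracle. The `customers` table, its `phone` column, and the "keep only digits and +" rule are invented for the example.

```python
import re
import sqlite3

# Hypothetical rule: strip everything except digits and '+' from a phone field.
PHONE_JUNK = re.compile(r"[^0-9+]")


def clean_phone(value):
    """Cleaning logic lives in the application, not in the database."""
    return PHONE_JUNK.sub("", value)


def clean_table(conn):
    """Read rows, clean them in application code, and write back only the changes.

    The database is used for nothing but a SELECT and parameterised UPDATEs.
    Returns the number of rows that were changed.
    """
    rows = conn.execute("SELECT id, phone FROM customers").fetchall()
    changes = []
    for row_id, phone in rows:
        cleaned = clean_phone(phone)
        if cleaned != phone:
            changes.append((cleaned, row_id))
    conn.executemany("UPDATE customers SET phone = ? WHERE id = ?", changes)
    conn.commit()
    return len(changes)


def demo():
    """Build a throwaway in-memory table (setup only) and clean it."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real Oracle connection
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, phone TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "(555) 010-2000"), (2, "+15550103000"), (3, "555.010.4000 x")],
    )
    changed = clean_table(conn)
    rows = conn.execute("SELECT id, phone FROM customers ORDER BY id").fetchall()
    conn.close()
    return changed, rows


if __name__ == "__main__":
    print(demo())
```

Whether this ends up faster than doing the same work inside the database is not something the sketch can show; it depends on data volume and on how many round trips the application makes, so it is worth measuring on the real file.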