#31
On Thu, 28 Jan 2010 14:08:20 +0100, Laurenz Albe wrote:
> [...] assume that the reason for this is that in the old days, file systems used to be much worse than they are now, with performance penalties on directories with many files in them and the like.

The company wants to move a data warehouse from an Oracle RAC to PostgreSQL. The database on RAC is essentially one gigantic table, having 400 million records, partitioned into month-sized chunks and searched via text indexes. The pilot project loaded 3 months' worth of partitions into PgSQL and developers are now writing a little Groovy Grails application to search it. I am just the DBA here. The size of the DW is going to exceed 4TB.
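For readers wondering what "month-sized chunks" with text indexes might look like on the PostgreSQL side, here is a minimal sketch that only generates the DDL strings. It assumes declarative range partitioning, which exists in PostgreSQL 10 and later (at the time of this thread the usual approach was table inheritance with CHECK constraints), and it assumes the text search would use a GIN index over `to_tsvector`. The table name `documents`, the columns `created` and `body`, and the helper names are all hypothetical; nothing here is taken from the actual pilot project.

```python
from datetime import date


def month_starts(first: date, count: int) -> list[date]:
    """Return the first day of `count` consecutive months, beginning with the month of `first`."""
    starts = []
    year, month = first.year, first.month
    for _ in range(count):
        starts.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return starts


def partition_ddl(parent: str, column: str, first: date, months: int) -> list[str]:
    """Build DDL for a range-partitioned parent table with one child per month.

    The column layout (id, <column>, body) is a made-up example schema.
    """
    # One extra boundary is needed to close the last month's range.
    bounds = month_starts(first, months + 1)
    statements = [
        f"CREATE TABLE {parent} (id bigint, {column} date NOT NULL, body text) "
        f"PARTITION BY RANGE ({column});"
    ]
    for lower, upper in zip(bounds, bounds[1:]):
        child = f"{parent}_{lower:%Y_%m}"
        # The lower bound is inclusive and the upper bound is exclusive.
        statements.append(
            f"CREATE TABLE {child} PARTITION OF {parent} "
            f"FOR VALUES FROM ('{lower.isoformat()}') TO ('{upper.isoformat()}');"
        )
        # Assumed search strategy: a full text GIN index on each partition.
        statements.append(
            f"CREATE INDEX {child}_fts ON {child} "
            f"USING GIN (to_tsvector('english', body));"
        )
    return statements


if __name__ == "__main__":
    # Three monthly partitions, mirroring the three months loaded in the pilot.
    for statement in partition_ddl("documents", "created", date(2009, 11, 1), 3):
        print(statement)
```

Generating the statements rather than writing them by hand keeps the month boundaries consistent, including the rollover from December to January.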
#32
Well, it's a brave decision on the part of your management. Whilst PostgreSQL is not a bad choice in many cases, data warehousing is not one of them.

For a more or less objective comparison, you may want to run a 100GB scale TPC-H benchmark on the same hardware and compare the results. Now, many would claim that TPC-H is not representative of "real world queries", but the queries are in fact nothing more than a bunch of joins and aggregations on a pretty simple schema. If the results do not convince the management, nothing will. In our experiments, Pg was about 6 times slower than the existing Oracle DB on our typical queries.
#33
On Wed, 17 Mar 2010 06:04:42 -0700, V.J. Kumar wrote:
> Well, it's a brave decision on the part of your management. Whilst PostgreSQL is not a bad choice in many cases, data warehousing is not one of them.

This data warehouse is a special case: it doesn't involve joins or aggregation. It is essentially a huge document repository that allows searches.

-- http://mgogala.byethost5.com
#34
On Mar 17, 12:13 pm, Mladen Gogala <n... (AT) email (DOT) here.invalid> wrote:
> On Wed, 17 Mar 2010 06:04:42 -0700, V.J. Kumar wrote:
>> Well, it's a brave decision on the part of your management. Whilst PostgreSQL is not a bad choice in many cases, data warehousing is not one of them.
>
> This data warehouse is a special case: it doesn't involve joins or aggregation. It is essentially a huge document repository that allows searches.

Ah, it may be usable then. We've experienced a 6x slowdown even on simple aggregations with a 25-million-row table (about 2.5 GB). Even count(*) was about 4 times slower with the same table on the same hardware. In both experiments, the table was fully cached, so no I/O penalty was incurred.
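A comparison like the one above (same table, same hardware, data fully cached) can be reproduced with a small timing harness. The sketch below is not how the numbers in this post were obtained. It assumes the caller supplies a zero-argument function that runs the query against each database, and the names `median_seconds` and `slowdown` are made up for illustration. One untimed warm-up run is done first so that the timed runs hit cached data, and the median is used to dampen outliers.

```python
import statistics
import time
from typing import Callable


def median_seconds(run_query: Callable[[], object], repeats: int = 5) -> float:
    """Run `run_query` once untimed to warm caches, then return the median of `repeats` timed runs."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    run_query()  # warm-up run, deliberately not timed
    samples = []
    for _ in range(repeats):
        started = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - started)
    return statistics.median(samples)


def slowdown(candidate_seconds: float, baseline_seconds: float) -> float:
    """Return how many times slower the candidate is than the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return candidate_seconds / baseline_seconds


if __name__ == "__main__":
    # Stand-in workloads; in practice each callable would execute the same
    # query (for example a count(*)) through that database's own driver.
    baseline = median_seconds(lambda: sum(range(100_000)))
    candidate = median_seconds(lambda: sum(range(400_000)))
    print(f"slowdown: {slowdown(candidate, baseline):.1f}x")
```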