![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi all, I'm looking into Ingres 2006 as a replacement for a MySQL database. I've read through what I can find on the Ingres web site, but I have a number of questions about Ingres that I haven't been able to find the answers to/have been to dense to comprehend from the documentation. Most of these questions are seemingly simple, for example: Does Ingres 2006 support: - Correlated Subqueries? |
|
- Online node addition? - Mixed hardware on nodes? Mixed OSes? Mixed 32-bit/64-bit installs? |
|
Is this an appropriate place to ask about these kinds of questions, or should I be looking/asking elsewhere? |
#3
| |||
| |||
|
|
At 3:11 PM +0100 3/28/06, Andrew Hill wrote: - Online node addition? - Mixed hardware on nodes? Mixed OSes? Mixed 32-bit/64-bit installs? Are you talking about clusters here? I'm not sure that active/active clusters are officially GA yet. (I gather that it's due to ongoing glitches with the available cluster filesystems, none of the free ones seem to be truly production ready. They pretty much mostly work, and that isn't good enough.) |
#4
| |||
| |||
|
|
On Tue, 2006-03-28 at 09:30 -0500, Betty & Karl Schendel wrote: At 3:11 PM +0100 3/28/06, Andrew Hill wrote: - Online node addition? - Mixed hardware on nodes? Mixed OSes? Mixed 32-bit/64-bit installs? Are you talking about clusters here? I'm not sure that active/active clusters are officially GA yet. (I gather that it's due to ongoing glitches with the available cluster filesystems, none of the free ones seem to be truly production ready. They pretty much mostly work, and that isn't good enough.) Yes, I was talking about clusters. The MySQL database I manage is starting to push the limits of what it can do, and horizontal scaling is starting to look attractive. |
#5
| |||
| |||
|
|
"Andrew Hill" <list (AT) fornax (DOT) net> wrote in message news:mailman.1143557582.8013.info-ingres (AT) cariboulake (DOT) com... I was talking about clusters. The MySQL database I manage is starting to push the limits of what it can do, and horizontal scaling is starting to look attractive. You haven't given us any information about volume and job-mix so it is hard to be confident about any advice, but if you are reaching throughput limits I suspect Ingres clusters are a long way from helping you with that even if the filesystems can be made to work. Ingres will do a fine job of using a cluster for failover (i.e. high-availability clusters (HAC)), but the distributed lock management still needs to be proven to scale to support a high-performance cluster (HPC). Personally I would not bet my farm on it, although by the same token I don't know for sure that it doesn't scale either. On the other hand, Ingres does allow up to 30 servers instances to serve the same database, and a practically unlimited number of "net" servers, so perhaps that gives you some room to scale up. What are your performance/workload targets? |
#6
| |||
| |||
|
|
What are your performance/workload targets? I was a bit vague, wasn't I? But aren't you going to do my thinking for me? ;-) The main problem I have with MySQL at the moment is for a quite simple query: INSERT INTO summary_table ( day, hour, operation_interval, operation_interval_id, interval_start, interval_end, ad_id, creative_id, zone_id, impressions ) SELECT DATE_FORMAT(drad.date_time, '%Y-%m-%d') AS day, DATE_FORMAT(drad.date_time, '%k') AS hour, 60 AS operation_interval, 0 AS operation_interval_id, '2006-03-26 00:00:00' AS interval_start, '2006-03-26 00:59:59' AS interval_end, drad.ad_id AS ad_id, drad.creative_id AS creative_id, drad.zone_id AS zone_id, COUNT(*) AS impressions FROM raw_impressions_20060326 AS drad WHERE drad.date_time >= 20060326000000 AND drad.date_time <= 20060326005959 GROUP BY day, hour, ad_id, creative_id, zone_id; As you can see, it's not rocket science; however, with the number of rows the daily raw_impressions_20060326 table (currently about 40 million), MySQL has problems performing the GROUP BY in memory, and so it's slow. (Yes, there's various optimisations in place, etc. but they will only take us so far...) I'd like to be able to split this query up so that it runs on a number of nodes, so that we can add more nodes as the number of rows increases... Is this something that Ingres can do? |
#7
| |||
| |||
|
|
"Andrew Hill" <list (AT) fornax (DOT) net> wrote in message news:mailman.1143713640.6031.info-ingres (AT) cariboulake (DOT) com... What are your performance/workload targets? The main problem I have with MySQL at the moment is for a quite simple query: |
|
I'd like to be able to split this query up so that it runs on a number of nodes, so that we can add more nodes as the number of rows increases... Is this something that Ingres can do? My first impulse is to say "in its sleep", but before I shoot my mouth off, where does the raw_impressions_20060326 table itself come from, and how many queries similar to the above will you want to run on it? Also what is its useful lifespan, and how many distinct sessions need to access it during its lifespan? |
#8
| |||
| |||
|
|
On Thu, 2006-03-30 at 13:11 +0100, Roy Hann wrote: "Andrew Hill" <list (AT) fornax (DOT) net> wrote in message news:mailman.1143713640.6031.info-ingres (AT) cariboulake (DOT) com... What are your performance/workload targets? The main problem I have with MySQL at the moment is for a quite simple query: *snip* I'd like to be able to split this query up so that it runs on a number of nodes, so that we can add more nodes as the number of rows increases... Is this something that Ingres can do? My first impulse is to say "in its sleep"... |
|
(A nice, fast bulk import into the Ingres 2006 cluster would be handy here. Can that be done in // too?) |
#9
| ||||
| ||||
|
|
My first impulse is to say "in its sleep", but before I shoot my mouth off, where does the raw_impressions_20060326 table itself come from, and how many queries similar to the above will you want to run on it? Also what is its useful lifespan, and how many distinct sessions need to access it during its lifespan? Well, the data comes from a collection of web servers. Each has a local instance of MySQL, and we package up our raw_impressions every 5 minutes on each web server, and bulk import them into the central raw_impressions_20060326 table. |
|
(A nice, fast bulk import into the Ingres 2006 cluster would be handy here. Can that be done in // too?) |
|
We run 3 queries similar to the above, but all one after the other, so there's no problems with lots of people wanting to read/write the table - the only other thing that would be (might be) happening at the same time would be the insertion of new data (as per the above process). |
|
All of our reporting/BI work is done from the summary tables - however, the lifespan of the raw daily tables is ~90 days, as we do also do a join on a (smaller) raw table to this table as well. (However, it's quite fast, and I'd be happy limiting that query to run on a single node, to avoid inter-node communications, if it worked better that way...) Still sounding good? |
#10
| |||
| |||
|
|
I am inclined to agree with Roy. Unless the query results in many tens or hundreds of thousands of result groups, I would expect Ingres to hash-aggregate the data without really breaking a sweat. |
![]() |
| Thread Tools | |
| Display Modes | |
| |