#2
I'm rather new to MySQL Cluster, so let me ask a question that might seem like a newbie one. Say I have a cluster consisting of 4 data nodes with 2 replicas, so there are 2 cluster groups, like here:

So when I run an INSERT query via a MySQL node, how does that data spread across all partitions?

Does it happen simultaneously?
#3
vadim <samokhinva... (AT) gmail (DOT) com> wrote:

> I'm rather new to MySQL Cluster, so let me ask a question that might seem like a newbie one. Say I have a cluster consisting of 4 data nodes with 2 replicas, so there are 2 cluster groups, like here:

It's called "node groups".

> So when I run an INSERT query via a MySQL node, how does that data spread across all partitions?

If you insert a single row, then this row goes to exactly one partition of the cluster table. Which one depends on the partitioning key and the value(s) that the new row has in the key column(s). Now, since you have replicas=2, there are 2 nodes in the cluster that have this partition. Or, using MySQL Cluster terminology, all nodes in a certain node group have it. Those nodes receive and store the new row.

> Does it happen simultaneously?

From the outside, yes. Internally this happens: the INSERT is part of a transaction. For this transaction one data node was chosen (or is chosen now) to act as transaction coordinator (TC). The SQL node sends the new row to the TC (using the cluster's internal signaling mechanism). The TC determines which data nodes hold the respective partition and sends the row to them. The SQL node only talks to the TC. As soon as the TC acknowledges that it has the new row, everything is fine for the SQL node.

COMMITting is a different story. This requires a full round trip to all participating data nodes (actually two: one for phase 1 and another for phase 2).

Look out for the "MySQL Cluster: The Complete Tutorial" and/or "Design and Internals of MySQL Cluster" papers from Stewart Smith.

XL
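To make the flow above a bit more concrete, here is a toy Python sketch of it: a row's key picks one partition, the partition belongs to one node group, the TC forwards the row to both replicas, and the commit takes two rounds. Everything in it is an assumption made for illustration: the node IDs, the partition count, the hash function, the round-robin partition-to-node-group mapping and the `TransactionCoordinator` class are all hypothetical, and NDB's real hashing and commit protocol are not modeled.

```python
import hashlib

# Hypothetical layout: 4 data nodes, 2 replicas -> 2 node groups.
NODE_GROUPS = {0: [1, 2], 1: [3, 4]}
NUM_PARTITIONS = 4  # assumption: one partition per data node


def partition_for(key):
    """Map a partitioning-key value to a partition (illustrative hash only)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def replicas_for(key):
    """Return the data nodes (one whole node group) that store this key.

    Assumption: partitions are assigned to node groups round-robin.
    """
    group = partition_for(key) % len(NODE_GROUPS)
    return NODE_GROUPS[group]


class TransactionCoordinator:
    """Toy TC: forwards rows to the replicas, then commits in two rounds."""

    def __init__(self):
        self.pending = {}    # node id -> list of (key, row) awaiting commit
        self.committed = {}  # node id -> {key: row}

    def insert(self, key, row):
        """Send the row to every replica; return the ack for the SQL node."""
        for node in replicas_for(key):
            self.pending.setdefault(node, []).append((key, row))
        return True

    def _prepare(self, node):
        """Round 1 for one participant. The toy node always votes yes."""
        return True

    def commit(self):
        """Round 1: collect votes. Round 2: make the rows durable."""
        participants = sorted(self.pending)
        if not all(self._prepare(node) for node in participants):
            self.pending.clear()  # abort: drop the uncommitted rows
            return False
        for node in participants:
            store = self.committed.setdefault(node, {})
            for key, row in self.pending[node]:
                store[key] = row
        self.pending.clear()
        return True
```

Calling `tc.insert("42", {...})` followed by `tc.commit()` on a fresh `TransactionCoordinator` leaves the row on exactly the two nodes returned by `replicas_for("42")`, and on no others.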
Thought something was wrong with Google Groups because I couldn't see
So, once again:
#4
Oops, seems I replied to you by email twice. Sorry, I didn't mean to.

Have you used MySQL Cluster in a production environment, and what is your opinion of it compared to regular MySQL?

What problems have you run into, and how did you deal with them?
#5
Hi Vadim,

vadim <samokhinva... (AT) gmail (DOT) com> wrote:

> Oops, seems I replied to you by email twice. Sorry, I didn't mean to.

Oops, looks like my spam filter ate it. I hadn't answered anyway.

> Have you used MySQL Cluster in a production environment, and what is your

I am working with MySQL Cluster on a professional basis, yes.

> opinion of it compared to regular MySQL?

It's a completely different beast. If you try using NDB as a drop-in replacement for, e.g., InnoDB, you will be disappointed. Maybe it was a bad idea to call NDB "MySQL Cluster", because everybody imagines a traditional cluster of an old-fashioned SQL engine with some shared storage. NDB is different!

NDB was designed as a shared-nothing, in-memory, real-time database engine. It distributes data across multiple nodes not only for HA, but also for performance (scale-out). It is tremendously fast for simple operations like index lookups. It's pretty bad for complex queries, like JOINs or full-table aggregates.

If you intend to use NDB through the SQL layer, you cannot easily leverage its full performance. For a typical 4-node NDB cluster you will need at least 4 SQL nodes and must distribute your load across them. Using NDBAPI in your application (for direct access to NDB) will yield a 3x to 10x increase in performance. Using the asynchronous variant of NDBAPI will give another factor of ~2.

> What problems have you run into, and how did you deal with them?

NDB is a diva. You must know what kind of workload you are going to apply and configure NDB accordingly. If you misconfigure it or overload it, it may simply crash (although mostly with a message along the lines of "running out of xxx buffer, increase xxx-buffer-size"). But if you configure NDB properly and use it for what it was designed for, you can get really impressive performance:

http://johanandersson.blogspot.com/2...r-performance-...

XL
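The point about needing several SQL nodes and spreading the load over them can be sketched on the application side. The snippet below is only one possible approach, a plain round-robin picker in Python. The host names and the `RoundRobinRouter` class are hypothetical, and a real setup might instead use a load balancer or a connector's own failover support.

```python
import itertools
from collections import Counter

# Hypothetical host names for the 4 SQL nodes in front of the cluster.
SQL_NODES = ["sql1.example", "sql2.example", "sql3.example", "sql4.example"]


class RoundRobinRouter:
    """Hand out SQL node addresses in a fixed rotation."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one SQL node is required")
        self._cycle = itertools.cycle(hosts)

    def next_host(self):
        """Return the host that the next connection should go to."""
        return next(self._cycle)


def spread(router, n_connections):
    """Count how many of n_connections each host would receive."""
    return Counter(router.next_host() for _ in range(n_connections))
```

With 4 hosts and 8 connections, `spread` assigns 2 connections to each host. The application would open its database connection to whatever `next_host()` returns.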
#7
On Nov 9, 12:46 pm, Axel Schwenke <axel.schwe... (AT) gmx (DOT) de> wrote:

> NDB was designed as shared-nothing, in-memory, realtime database engine. It distributes on multiple nodes not only for HA, but also for performance (scale out). It is tremendously fast for simple operations like index lookups. It's pretty bad for complex queries, like JOINs or full table aggregates.

> ... if you configure NDB accordingly and use it for what it was designed for, then you can get really impressive performance:

So I should use it with simple queries, OK. But it's hard to imagine a web application without joins; otherwise there will be a lot of simple queries. For example, I need to output a list of posts with the authors' names.

So my first query will fetch the list of posts, and the others, one query per post, will retrieve the authors. Is that the MySQL Cluster way?

And what about caching, by the way? If MySQL Cluster is an in-memory DB engine, should I use Memcache or anything else?
#8
vadim <samokhinvadim (AT) gmail (DOT) com> wrote:

> otherwise there will be a lot of simple queries. For example I need to output a list of posts with the authors' names.

Another "web forum" software? Don't we have too many of those already? And what's wrong with Usenet anyway?