![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
I currently have 6 message boards) that are identical in structure, but have different data. Each board has 2 tables; one for "subjects", and one for "posts". Each of the "posts" tables currently have approximately 1 million rows, and during peak hours, it's not uncommon to have 500+ queries at once. I'm about to launch 100 new boards. Within the next year, I expect each of them to have at least a few hundred thousand rows, which brings my total "posts" data to roughly 18 million rows; within the next 5 years, I would expect it to be > 1 billion rows. Knowing this, would you guys recommend that I create these new boards with 100 tables with identical structure (that would be, 100 "subjects" and 100 "posts"), or is there an advantage (in speed and efficiency) to have them all in a single table, with a new column that defines which board that row belongs to? I had originally planned to have them all separate so that, in the future, if I need to move some to a different server then it would be simple and easy. But in retrospect, it's possible that running this database more efficiently would eliminate (or, at least minimize) that need to move. TIA, Jason |
#3
| |||
| |||
|
|
On Jan 3, 7:17*pm, jwcarlton <jwcarl... (AT) gmail (DOT) com> wrote: I currently have 6 message boards) that are identical in structure, but have different data. Each board has 2 tables; one for "subjects", and one for "posts". Each of the "posts" tables currently have approximately 1 million rows, and during peak hours, it's not uncommon to have 500+ queries at once. I'm about to launch 100 new boards. Within the next year, I expect each of them to have at least a few hundred thousand rows, which brings my total "posts" data to roughly 18 million rows; within the next 5 years, I would expect it to be > 1 billion rows. Knowing this, would you guys recommend that I create these new boards with 100 tables with identical structure (that would be, 100 "subjects" and 100 "posts"), or is there an advantage (in speed and efficiency) to have them all in a single table, with a new column that defines which board that row belongs to? I had originally planned to have them all separate so that, in the future, if I need to move some to a different server then it would be simple and easy. But in retrospect, it's possible that running this database more efficiently would eliminate (or, at least minimize) that need to move. TIA, Jason As as DBA you need to consider more than just the logical aspect of your database. *Can the overall SYSTEM handle the workload? Again, that depends on lots of different factors mentioned below (and more). At the rate you quoted (500queries "at once") for one bbs, you may very well run into I/O issues long before you reach the target number of bbs's depending upon your hardware. *Is it a PC/Desktop class or a Server class system... *You may run out of horsepower before you get to 100/100 of anything. This, of course, is dependent upon what **you** consider "at once" *500 queries/sec /min? /hr? *Do you have any way to test 50000 queries "at once"? *Once you have this in place, how big are the rows (average)? *Will it take a few GB or a few hundred GB or > *TB to store the data you have outlined. If you were to go with the one table with many boards, look at partitioning that table by that new column and storing each partition (data AND index) on separate disks. *This will improve the parallelism. *Even with RAID5 or RAID0+1, you could very well run into an I/O channel bandwidth issue, again, depending on the "system". If possible use a SAN with fiber-channel to maximize your performance. Look athttp://dev.mysql.com/tech-resources/articles/mysql_5.1_partitions.html example: create table... * PARTITION BY RANGE(id) * * *( PARTITION p0 VALUES LESS THAN(250000) ( * * * * * INDEX DIRECTORY = '/u01/orders/i_p0' * * * * * DATA DIRECTORY *= '/u02/orders/d_p0' * * *), * * * PARTITION p1 VALUES LESS THAN(500000) ( * * * * * INDEX DIRECTORY = '/u03/orders/i_p1' * * * * * DATA DIRECTORY *= '/u04/orders/d_p1', * * * ), * * * PARTITION p2 VALUES LESS THAN(MAXVALUE) ( * * * * * INDEX DIRECTORY = '/u05/orders/i_p2' * * * * * DATA DIRECTORY *= '/u06/orders/d_p2' * * * ) etc.... ); Remember, your spindles have only one set of heads and they can only be in one place at a time so "partitioning" the disk device(s) is absolutely worthless. *Next, how are you going to back this thing up? You need to begin to think about recoverability of this data. You have only just begun... |
![]() |
| Thread Tools | |
| Display Modes | |
| |