![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi All, I had posted this questions related to scalability design constraints on alt.comp.databases as I thought the questions are generic in nature. I still haven't received any responses. I'm posting to this news group since I'm using Sybase database for your inputs. Thanks, krishna ----- Original Message ----- From: "krishna sathyanarayana" <krishnas (AT) nospam (DOT) com Newsgroups: alt.comp.databases Sent: Thursday, February 19, 2004 11:34 PM Subject: Question on number of records in a database Hello Database Gurus, Sorry if my following questions on scalability/performance of database is a repeat. I would like to know how big a table can be in a database such as Sybase7.0 before I start seeing performance problems. Is there any such restriction. Also, if I have many tables of this size, how would be the system behave with respect to performance. Specifically, does the system performance depend on size the table on which I'm doing SQL query or total size of all the tables? I know this is an open ended question but following is the reason why I'm asking this. In our current implementation of database we a separate table for each model of a product. If there is a new functionality added to a product, schema changes are required to add new columns requiring our product to be tested for many upgrade paths increasing the development time and this also involves restarting of database during upgrade. We can remove this if we don't have a separate table for each model but by storing in one table as property name and value pairs. But, if we do this, the size of the table could become very large. Will there be any performance problem if the table size becomes large? Also, what do you think about collapsing many tables to a single table with property name and value pairs. Is it elegant design? .. Is this approach generally followed? Any pointers to any of the above is greatly appreciated. Thanks in advance, krishna |
#3
| |||
| |||
|
|
Krishna, Think product first. The table design follows. One should use an object-oriented approach to get from attributes of product to table defintions. Strictly from an OO point of view, if all your products have a common set of attributes, those common set of attributes would be the attributes of your "base" product. They are the fixed attributes. These attributes translate directly to columns in a single "base" table. If you have a variation of the product, then you would define only the variat attributes in a different table. The entire definition of the variant table is a join between the "base" table and the variation table. You have a single table for each different variation. If the variations only different by a couple of attributes, you can consider collapsing the columns into the single table. These columns would be NULL columns since a "base" table would not have values for them. As far as how many tables and how many rows in a table can Sybase handle, unless you have a few billion products, there will no problem. Usually, performance is a function of how well you have defined your indexes,. Performance is also what your system requires in the way of performance. For example, Sybase and MS SQL Server are excellent at inserting millions of rows into tables with "clustered indexes." After about several million inserts, then you will start seeing a slow down. Oracle, on the other hand, completely sucks at "index organized tables". You will see an unforgiveable slow down after about 20,000 rows. In Oracle, one is forced to use a non-clustered index -- which means the overall table size needlessly grows by (rows * size of all index columns). You can insert 1,000,000 rows into a Sybase clustered index table in 30 minutes. It literally takes days to do this with the equivalent Oracle "index-organized" table. If you are updating columns a lot, that is a different aspect of performance. All databases prefer fixed data types for columns. For example, they prefer char() over varchar(). for updates. Some databases are far worse at updating than others. I am talking about Oracle. Oracle spells disaster when it comes to update performance. In Oracle, even the basic NUMBER types are variable-length columns. In Oracle, when you update a column with a larger value than the previous value, Oracle splits the row, and makes a chain to the one column value that is now larger. A single column value will reside on a different section of disk apart from the rest of the row. This is a performance killer when reading the entire row later. Oracle has to collect all the columns of a single row from all over the disk. Sybase and MS SQL Server, on the other hand, move the entire row upon such an update. Also, in Sybase and MS SQL Server, all number datatypes are fixed length--hence no movement of the row when updating such a column. While Sybase's "deferred updated" is a performance hit on the initial update, subsequent reads are 3x faster. And 99.9% of the time, there is far more reading done than updating.. Brian http://www.dbpowersuite.com "krishna sathyanarayana" <krishnas (AT) cisco (DOT) com> wrote in message news:1077596040.656521 (AT) sj-nntpcache-3 (DOT) .. Hi All, I had posted this questions related to scalability design constraints on alt.comp.databases as I thought the questions are generic in nature. I still haven't received any responses. I'm posting to this news group since I'm using Sybase database for your inputs. Thanks, krishna ----- Original Message ----- From: "krishna sathyanarayana" <krishnas (AT) nospam (DOT) com Newsgroups: alt.comp.databases Sent: Thursday, February 19, 2004 11:34 PM Subject: Question on number of records in a database Hello Database Gurus, Sorry if my following questions on scalability/performance of database is a repeat. I would like to know how big a table can be in a database such as Sybase7.0 before I start seeing performance problems. Is there any such restriction. Also, if I have many tables of this size, how would be the system behave with respect to performance. Specifically, does the system performance depend on size the table on which I'm doing SQL query or total size of all the tables? I know this is an open ended question but following is the reason why I'm asking this. In our current implementation of database we a separate table for each model of a product. If there is a new functionality added to a product, schema changes are required to add new columns requiring our product to be tested for many upgrade paths increasing the development time and this also involves restarting of database during upgrade. We can remove this if we don't have a separate table for each model but by storing in one table as property name and value pairs. But, if we do this, the size of the table could become very large. Will there be any performance problem if the table size becomes large? Also, what do you think about collapsing many tables to a single table with property name and value pairs. Is it elegant design? . Is this approach generally followed? Any pointers to any of the above is greatly appreciated. Thanks in advance, krishna |
![]() |
| Thread Tools | |
| Display Modes | |
| |