![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hello, Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit I have a table that will contain around 300 million rows so space utilization is a primary criterion for the design of the table. One of the columns will be loaded with a variable length numeric value from 1 to 9999999999 that will contain leading zeroes. Since this numeric column value is not used in any computations, if space were not so paramount, I would normally define it as varchar2(10) but we are considering defining it as NUMBER (realizing that we will lose the leading zeroes when the column is populated). My understanding is that the NUMBER datatype will generally require less space than the VARCHAR2 datatype but is defining the column as NUMBER a reasonable design compromise? TIA Luke |
#3
| |||
| |||
|
|
Hello, Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit I have a table that will contain around 300 million rows so space utilization is a primary criterion for the design of the table. One of the columns will be loaded with a variable length numeric value from 1 to 9999999999 that will contain leading zeroes. Since this numeric column value is not used in any computations, if space were not so paramount, I would normally define it as varchar2(10) but we are considering defining it as NUMBER (realizing that we will lose the leading zeroes when the column is populated). My understanding is that the NUMBER datatype will generally require less space than the VARCHAR2 datatype but is defining the column as NUMBER a reasonable design compromise? TIA Luke |
#4
| |||
| |||
|
|
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit I have a table that will contain around 300 million rows so space utilization is a primary criterion for the design of the table. One of the columns will be loaded with a variable length numeric value from 1 to 9999999999 that will contain leading zeroes. |
|
Since this numeric column value is not used in any computations, if space were not so paramount, I would normally define it as varchar2(10) but we are considering defining it as NUMBER (realizing that we will lose the leading zeroes when the column is populated). |
#5
| |||
| |||
|
|
On 2011-05-04 01:26, Luke <luke_airig (AT) hotmail (DOT) com> wrote: Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit I have a table that will contain around 300 million rows so space utilization is a primary criterion for the design of the table. One of the columns will be loaded with a variable length numeric value from 1 to 9999999999 that will contain leading zeroes. What does "variable length ... that will contain leading zeroes" mean exactly: 1) The value without leading zeroes is variable length, leading zeroes are used to pad to 10 digits ("0000000042" is a legal value, but "00042" is not). 2) The length including leading zeroes is variable, but the leading zeroes are not significant ("0000000042" and "00042" could both occur but are considered the same value) 3) The length including leading zeroes is variable and all digits are significant ("0000000042" and "00042" can both occur and are different values). Since this numeric column value is not used in any computations, if space were not so paramount, I would normally define it as varchar2(10) but we are considering defining it as NUMBER (realizing that we will lose the leading zeroes when the column is populated). This will only really work in the case 1 above. It will sort of work in case 2 (i.e. you can still identify your records but users may be irritated if "the computer eats their zeroes"). It will not work at all in case 3. You can, however, make it work in case 3 by prepending a fixed digit. So you store "0000000042" as 10000000042 and "00042" as 100042. For display purposes you strip the first digit. This will add 0.5 bytes per row on average. Alternatively you could use a varchar2 and store each pair of digits in a single ASCII character. That would shave off another byte compared to storing it as number, but has the disadvantage that you have cryptic strings instead of readable numbers in your database - probably not worth the bother. hp |

#6
| |||
| |||
|
|
Leading zeros... Think everyone haves them. Rockhound, |
#7
| |||
| |||
|
|
On 5/8/2011 11:08 PM, Rockhound57 wrote: Leading zeros... Think everyone haves them. Rockhound, Did you meant to say "everyone *hates* them"? If so, I'm in complete agreement. |

|
Luke, Unless you are working with legacy data where (for instance) you need to distinguish between 42 042 and 000042, everything will work much much better if you just use the Number datatype. Numeric-like data with leading zeroes is a notorious bug-inducer - in particular, Microsoft Excel has fits whenever it encounters numeric-like data with leading zeroes. |

|
If you can design them out of the spec now, you will save yourself and your colleagues many many frustrating hours of coding around the leading zeroes. There are space and speed advantages of going with the Number datatype as well, but they pale in comparison to the frustration you'll encounter when the system *breaks* rather than just being slow or taking up a lot of space. //Walt |
#8
| |||
| |||
|
|
Hello, Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit I have a table that will contain around 300 million rows so space utilization is a primary criterion for the design of the table. One of the columns will be loaded with a variable length numeric value from 1 to 9999999999 that will contain leading zeroes. Since this numeric column value is not used in any computations, if space were not so paramount, I would normally define it as varchar2(10) but we are considering defining it as NUMBER (realizing that we will lose the leading zeroes when the column is populated). My understanding is that the NUMBER datatype will generally require less space than the VARCHAR2 datatype but is defining the column as NUMBER a reasonable design compromise? TIA |
#9
| |||
| |||
|
|
Luke <luke_airig (AT) hotmail (DOT) com> writes: Hello, Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit I have a table that will contain around 300 million rows so space utilization is a primary criterion for the design of the table. One of the columns will be loaded with a variable length numeric value from 1 to 9999999999 that will contain leading zeroes. Since this numeric column value is not used in any computations, if space were not so paramount, I would normally define it as varchar2(10) but we are considering defining it as NUMBER (realizing that we will lose the leading zeroes when the column is populated). My understanding is that the NUMBER datatype will generally require less space than the VARCHAR2 datatype but is defining the column as NUMBER a reasonable design compromise? TIA While I can understand your concern, I think it may be misplaced. It is likely that if you have a requirement to have numbers with leading 0, these are not really number, but identifiers (such as a staff number or similar). In this case, you are really dealing with a string of characters that just happen to all be digits. Now, one of the advantages of varchar2 is that the column will only use the space it requires while number columns will use a fixed size. So, the question is, will the average length of your varchar2 columns be larger than the fixed width number columns? But I don't think that is even the major point and suspect the differences will, in the scale of things, be minimal. What is possibly more crucial is how the value is going to be processed/used. For example, if the leading zeros are important, maybe you will end up with lots of to_char conversions to pad with leading zeros or maybe you will need to doing joins with other tables where the data is represented as varchar2 etc. Sorry this isn't really answering the question, but the problem is you cannot give an adequate answer without a lot more information regarding the application and how the value is used. One thing I can say, I've lost count of the number of times developers have used a number column to represent a value when it should have been represented as a varchar2 and was only defined as being a number because it represented some attribute that just happened to have the word number in its name, such as staff number. My rule of thumb is that if the value is not going to be used in mathematical calculations and your in doubt, make it a varchar2. Tim |
#10
| |||
| |||
|
|
On 14-06-2011 10:44, Tim X wrote: Luke <luke_ai... (AT) hotmail (DOT) com> writes: Hello, Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit I have a table that will contain around 300 million rows so space utilization is a primary criterion for the design of the table. One of the columns will be loaded with a variable length numeric value from 1 to 9999999999 that will contain leading zeroes. Since this numeric column value is not used in any computations, if space were not so paramount, I would normally define it as varchar2(10) but we are considering defining it as NUMBER (realizing that we will lose the leading zeroes when the column is populated). My understanding is that the NUMBER datatype will generally require less space than the VARCHAR2 datatype but is defining the column as NUMBER a reasonable design compromise? TIA While I can understand your concern, I think it may be misplaced. It is likely that if you have a requirement to have numbers with leading 0, these are not really number, but identifiers (such as a staff number or similar). In this case, you are really dealing with a string of characters that just happen to all be digits. Now, one of the advantages of varchar2 is that the column will only use the space it requires while number columns will use a fixed size. So, the question is, will the average length of your varchar2 columns be larger than the fixed width number columns? But I don't think that is even the major point and suspect the differences will, in the scale of things, be minimal. What is possibly more crucial is how the value is going to be processed/used. For example, if the leading zeros are important, maybe you will end up with lots of to_char conversions to pad with leading zeros or maybe you will need to doing joins with other tables where the data is represented as varchar2 etc. Sorry this isn't really answering the question, but the problem is you cannot give an adequate answer without a lot more information regarding the application and how the value is used. One thing I can say, I've lost count of the number of times developers have used a number column to represent a value when it should have been represented as a varchar2 and was only defined as being a number because it represented some attribute that just happened to have the word number in its name, such as staff number. My rule of thumb is that if the value is not going to be used in mathematical calculations and your in doubt, make it a varchar2. Tim If those columns always contain 10 numbers would it make a difference to define the table as CREATE TABLE table ( column1 CHAR(20)); or CREATE TABLE table ( column1 VARCHAR(20)); ? -- Luuk |
![]() |
| Thread Tools | |
| Display Modes | |
| |