![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
I have recently had a conversation with an Oracle support engineer who told me that, when analyzing VARCHAR2 columns, Oracle only counts the first 32 characters. I checked the statement and it is true: SQL> create table test1(col varchar2(40)); Table created. Elapsed: 00:00:00.07 SQL> insert into test1 values('01235678901234567890123456789012A'); 1 row created. Elapsed: 00:00:00.07 SQL> insert into test1 values('01235678901234567890123456789012B'); 1 row created. Elapsed: 00:00:00.08 SQL> commit; Commit complete. Elapsed: 00:00:00.06 SQL> analyze table test1 compute statistics Â* 2 Â*for table for all columns size 254; Table analyzed. Elapsed: 00:00:00.07 SQL> select column_name,num_distinct from user_tab_columns Â* 2 Â*where table_name='TEST1'; COLUMN_NAME Â* Â* Â* Â* Â* Â* Â* Â* Â* Â*NUM_DISTINCT ------------------------------ ------------ COL Â* Â* Â* Â* Â* Â* Â* Â* Â* Â*Â* Â* Â* Â* Â* Â* Â* Â* Â* 1 Elapsed: 00:00:00.11 SQL In other words, if you are storing FS paths into the database and analyzing the table, stats will show significantly smaller number of the distinct values than there really are. Queries may be messed up because of that. I tested on 10.2.0.5 and 11.2.0.1, the behavior is the same. Did anybody else notice this? --http://mgogala.byethost5.com |
#3
| |||
| |||
|
|
On Aug 4, 8:44Â*am, Mladen Gogala <n... (AT) email (DOT) here.invalid> wrote: I have recently had a conversation with an Oracle support engineer who told me that, when analyzing VARCHAR2 columns, Oracle only counts the first 32 characters. I checked the statement and it is true: SQL> create table test1(col varchar2(40)); Table created. Elapsed: 00:00:00.07 SQL> insert into test1 values('01235678901234567890123456789012A'); 1 row created. Elapsed: 00:00:00.07 SQL> insert into test1 values('01235678901234567890123456789012B'); 1 row created. Elapsed: 00:00:00.08 SQL> commit; Commit complete. Elapsed: 00:00:00.06 SQL> analyze table test1 compute statistics Â* 2 Â*for table for all columns size 254; Table analyzed. Elapsed: 00:00:00.07 SQL> select column_name,num_distinct from user_tab_columns Â* 2 Â*where table_name='TEST1'; COLUMN_NAME Â* Â* Â* Â* Â* Â* Â* Â* Â* Â*NUM_DISTINCT ------------------------------ ------------ COL Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* Â* 1 Elapsed: 00:00:00.11 SQL In other words, if you are storing FS paths into the database and analyzing the table, stats will show significantly smaller number of the distinct values than there really are. Queries may be messed up because of that. I tested on 10.2.0.5 and 11.2.0.1, the behavior is the same. Did anybody else notice this? --http://mgogala.byethost5.com Interesting. Tell Jože: http://joze-senegacnik.blogspot.com/...-in-determing- selectivity.html Same results as yours on 10.2.0.4 jg |
#4
| |||
| |||
|
|
I have recently had a conversation with an Oracle support engineer who told me that, when analyzing VARCHAR2 columns, Oracle only counts the first 32 characters. I checked the statement and it is true: |
|
-- http://mgogala.byethost5.com |
#5
| |||
| |||
|
|
Thanks! I didn't know that Joze had a blog and we have worked together in the past. Joze Senegacnik is a great guy, a member of the furniture store named "Oak Table" or something like that. *Apparently, Joze did a bit more disciplined and systematic testing than me. This, however, is an interesting anomaly. BTW, I got Tom's book, I am busy reading it, so I may miss an interesting blog or two or even some interesting threads on this group. |
#6
| |||
| |||
|
|
I've done some time ago a general discussion on the limitations of histograms with DBMS_STATS, since the issue is not limited to character columns: http://oracle-randolf.blogspot.com/2...of-histograms-... |
#7
| |||
| |||
|
|
I have recently had a conversation with an Oracle support engineer who told me that, when analyzing VARCHAR2 columns, Oracle only counts the first 32 characters. I checked the statement and it is true: SQL> create table test1(col varchar2(40)); Table created. Elapsed: 00:00:00.07 SQL> insert into test1 values('01235678901234567890123456789012A'); 1 row created. Elapsed: 00:00:00.07 SQL> insert into test1 values('01235678901234567890123456789012B'); 1 row created. Elapsed: 00:00:00.08 SQL> commit; Commit complete. Elapsed: 00:00:00.06 SQL> analyze table test1 compute statistics 2 for table for all columns size 254; Table analyzed. Elapsed: 00:00:00.07 SQL> select column_name,num_distinct from user_tab_columns 2 where table_name='TEST1'; COLUMN_NAME NUM_DISTINCT ------------------------------ ------------ COL 1 Elapsed: 00:00:00.11 SQL In other words, if you are storing FS paths into the database and analyzing the table, stats will show significantly smaller number of the distinct values than there really are. Queries may be messed up because of that. I tested on 10.2.0.5 and 11.2.0.1, the behavior is the same. Did anybody else notice this? |
![]() |
| Thread Tools | |
| Display Modes | |
| |