![]() | |
![]() |
| | Thread Tools | Display Modes |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Say I have a table with say 5 columns being (txn_date date, id integer, status integer, value dec(8,2) symbol integer) Table contains some 30m records. I then have 2 indexes being : create unique index a (txn_date, status, id) include (value, symbol) create index b (id, status, txn_date) |
|
The date will have far more id's per date, than dates per id's and I mean FAR more. Perhaps a max of 50 dates per id, whereas many thousands of ids per date. I am not inserting new id's but I am inserting via single daily stored procedure more dates per id. How do I determine which is better to apply a CLUSTER to for REORG purposes, and how do I determine the optimum order of the fields ? Also how do I determine PCTFREE and MINPCTUSED values for each ? |
#3
| |||
| |||
|
|
Not sure I understand your issue with clustering. To determine the cluster index I would look at the queries executed. Correct clustering avoids sorting, but without knowing what the queries are it's difficult to determine a clustering order. |
#4
| |||
| |||
|
#5
| |||
| |||
|
|
Problem is I have literally hundreds of queries to the table in question.I've been looking at snapshots and questioning the number of rows read, versus the number of rows selected, where there is a massive difference and Imean HUGE. Now I read recently some articles on index structure and the 3 tiers of pages within being Root, Intermediate and Leaf pages. From what I now understand, When constructing an index, it is better to create multi-column indexes. Which is perfectly understandable and I was always doing so. My problemis which order to create the index columns to avoid leaf scans. The issue is I have queries that use the ID and date columns independently so assigning one or the other as the first column will always result in leaf scans unless I create more than 1 multi-column index with differing orders. As in (ID, DATE ....) and another one with (DATE, ID ...). Here I now have to decide where to place the CLUSTER in which of the two indexes. If I only have one index (ID, DATE ...) then any query that has a predicate of WHERE DATE = etc results in leafs scans, so I have to create more than one to avoid such. Most of REALLY important queries use the DATE to join to other tables andthen use a predicate of WHERE ID = x . Now given the nature of the data in the table with thousands of IDs per date, and say only 50 dates per ID, I'm trying to understand where it is better to place a CLUSTER clause and use the other index clauses such as PCTFREE and MINPCTUSED, PAGE SPLIT HIGH (as most sorts are ASC). We are inserting say 50k records daily for new DATEs to existing IDs. I have another table that I use for PERIOD IDs. 1 record per business day, with 5 dates in sequence and an ID. ID, D1_DATE, D2_DATE, D3_DATE, D4_DATE, D5_DATE. I use this table to apply sequence to existing records by joining the DATE column of a table to say D2_DATE, I can query other date tables by joining on D1_DATE to find records ocurring in sequence. This Period table however gets massive rows read values for very few records selected, and I am not sure I understand why. This table only has some 3200 records in it but rows read in the hundreds of millions when joining to other date tables. Here again I have the issue or how to create the indexfor if I start with (d1_date, d2_date ... ) but have to join on the d3_date we again get the leaf scans. So am I better off creating multi-column indexes with all combinations and where do I add a cluster clause ? Very confusing for a non-technical self taught pleb like myself. Sorry for the essay but you can see my dilemma. |
#6
| ||||
| ||||
|
|
Problem is I have literally hundreds of queries to the table in question. |
|
Now I read recently some articles on index structure and the 3 tiers of pages within being Root, Intermediate and Leaf pages. From what I now understand, When constructing an index, it is better to create multi-column indexes. Which is perfectly understandable and I was always doing so. My problem is which order to create the index columns to avoid leaf scans. The issue is I have queries that use the ID and date columns independently so assigning one or the other as the first column will always result in leaf scans unless I create more than 1 multi-column index with differing orders. |
|
As in (ID, DATE ....) and another one with (DATE, ID ...). Here I now have to decide where to place the CLUSTER in which of the two indexes. If I only have one index (ID, DATE ...) then any query that has a predicate of WHERE DATE = etc results in leafs scans, so I have to create more than one to avoid such. Most of REALLY important queries use the DATE to join to other tables and then use a predicate of WHERE ID = x . Now given the nature of the data in the table with thousands of IDs per date, and say only 50 dates per ID, I'm trying to understand where it is better to place a CLUSTER clause and use the other index clauses such as PCTFREE and MINPCTUSED, PAGE SPLIT HIGH (as most sorts are ASC). We are inserting say 50k records daily for new DATEs to existing IDs. I have another table that I use for PERIOD IDs. 1 record per business day, with 5 dates in sequence and an ID. ID, D1_DATE, D2_DATE, D3_DATE, D4_DATE, D5_DATE. I use this table to apply sequence to existing records by joining the DATE column of a table to say D2_DATE, I can query other date tables by joining on D1_DATE to find records ocurring in sequence. This Period table however gets massive rows read values for very few records selected, and I am not sure I understand why. This table only has some 3200 records in it but rows read in the hundreds of millions when joining to other date tables. Here again I have the issue or how to create the index for if I start with (d1_date, d2_date ... ) but have to join on the d3_date we again get the leaf scans. So am I better off creating multi-column indexes with all combinations and where do I add a cluster clause ? Very confusing for a non-technical self taught pleb like myself. Sorry for the essay but you can see my dilemma. |
![]() |
| Thread Tools | |
| Display Modes | |
| |