dbTalk Databases Forums  

Index after the fact?

comp.databases.mysql comp.databases.mysql


Discuss Index after the fact? in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
SpreadTooThin
 
Posts: n/a

Default Index after the fact? - 11-25-2011 , 03:35 PM






My original schema / database / table create script did not include all my necessary keys.

Is it too late to know specify new keys?
Right now there is only one primary key.

I've noticed that certain queries are very slow when selecting on a column that isn't a key.

So my question is can a new key be added now?
Will that destroy any contents of the current table?
Will that speed up the results of select statements on columns that aren't keys?

TIA.
B.

Reply With Quote
  #2  
Old   
Doug Miller
 
Posts: n/a

Default Re: Index after the fact? - 11-25-2011 , 03:54 PM






On 11/25/2011 4:35 PM, SpreadTooThin wrote:
Quote:
My original schema / database / table create script did not include all my necessary keys.

Is it too late to know specify new keys?
Right now there is only one primary key.

I've noticed that certain queries are very slow when selecting on a column that isn't a key.

So my question is can a new key be added now?
http://dev.mysql.com/doc/refman/5.0/...ate-index.html

Quote:
Will that destroy any contents of the current table?
No.

Quote:
Will that speed up the results of select statements on columns that aren't keys?
If you build the index(es) properly, yes.

Reply With Quote
  #3  
Old   
SpreadTooThin
 
Posts: n/a

Default Re: Index after the fact? - 11-25-2011 , 04:07 PM



On Nov 25, 2:54*pm, Doug Miller <doug_at_milmacdot... (AT) example (DOT) com>
wrote:
Quote:
On 11/25/2011 4:35 PM, SpreadTooThin wrote:

My original schema / database / table create script did not include allmy necessary keys.

Is it too late to know specify new keys?
Right now there is only one primary key.

I've noticed that certain queries are very slow when selecting on a column that isn't a key.

So my question is can a new key be added now?

http://dev.mysql.com/doc/refman/5.0/...ate-index.html

Will that destroy any contents of the current table?

No.

Will that speed up the results of select statements on columns that aren't keys?

If you build the index(es) properly, yes.
And of course new insert will automatically update the new indecies
(indexes) as well?

Reply With Quote
  #4  
Old   
Doug Miller
 
Posts: n/a

Default Re: Index after the fact? - 11-25-2011 , 04:30 PM



On 11/25/2011 5:07 PM, SpreadTooThin wrote:

Quote:
And of course new insert will automatically update the new indecies
*indices
(indexes) as well?
Yes.

Reply With Quote
  #5  
Old   
Gordon Burditt
 
Posts: n/a

Default Re: Index after the fact? - 11-25-2011 , 10:16 PM



Quote:
My original schema / database / table create script did not include all my necessary keys.

Is it too late to know specify new keys?
No. Check out commands such as:
CREATE INDEX ...
ALTER TABLE foo ADD INDEX (last_name, first_name);
ALTER TABLE foo ADD UNIQUE (last_name, first_name);
ALTER TABLE foo ADD PRIMARY KEY (member_number);
ALTER TABLE foo DROP PRIMARY KEY;
ALTER TABLE foo DROP INDEX ...

If you have less than optimal indexes, you may want to drop the old
ones and add new ones (which can be done in one large ALTER TABLE).

Quote:
Right now there is only one primary key.
You are permitted to have at most one primary key, for much the
same reason that if you have twins, one child is considered the
primary child and the other one is only a twin. Ok, so that analogy
doesn't work so well. "primary key" has a vague connotation of
"most important key", which is why there can be only one. If you
want a different primary key, drop the old one first, then add a
new one.

You may have, however, multiple unique indexes. Each serves two
purposes: speeding lookups and preventing duplicates. (If you
want to allow duplicates, use a non-unique index). Only one of
them can be the primary key, if for no other reason that indexes
have names and only one of them can be named 'PRIMARY'.

I'm sure there is a limit on the number of indexes you can have but
unless you go really nuts creating them, it's not normally a problem.
(All storage engines allow at least 16 indexes. Some allow many
more.) Note that having lots of indexes slows down inserting and
deleting rows, and updating key fields in rows. Depending on the
ratio of looking up records vs. changing them in the mix of queries
your application uses, this can be a good thing or a bad thing.

(There are a few applications for a write-mostly, read-rarely
database, such as an ISP recording dynamic IP addresses assigned
to users (where login time is not billed). There might be 100,000
logins/logouts a day, and half a dozen lookups of the form "who had
this IP at this time" from abuse complaints.)


Quote:
I've noticed that certain queries are very slow when selecting on a column that isn't a key.

So my question is can a new key be added now?
Yes.

Quote:
Will that destroy any contents of the current table?
No. However, if you attempt to add a unique index when there are
duplicates in the table, adding the index will fail. Make it a
non-unique index or somehow eliminate the duplicates. Also, adding
an index with a large quantity of data can be slow.


Quote:
Will that speed up the results of select statements on columns that aren't keys?
With intelligent selection of the new key, (which might involve
multiple columns) it should speed up the query. If you do it while
your application is running, it may pause for a while while the
index is being created, depending on how much data you have in
there. Unless you are doing something real-time and safety-critical
like guiding a missile, this is not normally a problem.

Reply With Quote
  #6  
Old   
SpreadTooThin
 
Posts: n/a

Default Re: Index after the fact? - 11-27-2011 , 10:44 AM



Thanks very much for your detailed reply.

I was wondering if there is a way to analyze the sql that is being used to access the database and customize the indices based on the sql?

Reply With Quote
  #7  
Old   
The Natural Philosopher
 
Posts: n/a

Default Re: Index after the fact? - 11-27-2011 , 12:05 PM



SpreadTooThin wrote:
Quote:
Thanks very much for your detailed reply.

I was wondering if there is a way to analyze the sql that is being used to access the database and customize the indices based on the sql?
I am no guru, but look at the 'explain' command.

It is easy to add indices after the creation of a table.

My approach is to index until performance is acceptable - but beware of
caching. MySQL caches results so unless you change data between tests a
second lookup is massively faster than the first.

Reply With Quote
  #8  
Old   
Gordon Burditt
 
Posts: n/a

Default Re: Index after the fact? - 11-28-2011 , 07:14 PM



Quote:
Thanks very much for your detailed reply.

I was wondering if there is a way to analyze the sql that is being used to access the database and customize the indices based on the sql?
If you need to get hold of the queries being used, you can turn on
the MySQL log. Or look at the source code of your applications.
Then look at the SQL.

First, if you have auto_increment fields, you need a unique index
on that field (probably a primary key).

Also, if you need to enforce uniqueness on the values in a particular
column or combination of particular columns, you need a unique index
on that (the columns may be in any order). This is a candidate for
a primary key if you don't already have one.

Now, look at the WHERE and ON clauses used in your queries.

WHERE field = 'value'
or WHERE field > 'value'
or WHERE field < 'value'
suggests an index on field.

WHERE field1 = 'value1' AND field2 = 'value2'
suggests an index on (field1, field2) or (field2, field1).

WHERE field LIKE '%value%'
does not suggest an index, since there is no leading character
in the pattern.

WHERE field LIKE 'value%'
(with no leading % or _ wildcards) suggests an index on field.

WHERE last_name = 'value1' and first_name = 'value2' and
middle_name > 'S'
suggests an index on (last_name, first_name, middle_name) or
(first_name, last_name, middle_name). The exact-equality tests go
first, then an inequality test or a LIKE with a non-wildcard prefix.

If you JOIN two tables on one or more fields being equal,
that suggests an index on those fields (in any order).

Now, try to merge all the suggestions into a reasonable (and small)
set of indexes. Keep in mind that the most important indexes are
those suggested by queries that are:
- run most often
- deal with the most data
- are most time-critical
- are the most time-consuming
- are most selective (narrow down the choices the most)

There is no point in having multiple indexes on the same fields in
the same order on the same table. Drop duplicates.

If a suggested index has multiple fields, and one or more of those
fields is a unique index or primary key compared for exact equality,
you might as well drop all but one unique value, as the unique value
will select at most one record, and the others do not add much (that
one record will either be included or excluded by the values of the
other fields). Since you already have a unique index for the
uniqueness, no additional indexes are needed for that suggested
index.

Order of fields in an index matters. Phone books are ordered by
(last_name, first_name), and that ordering is pretty useless for
searching for first_name='George' AND street_address='123 Main
Street'. You end up having to scan the whole book.

An index on (a, b, c, d) works as an index on (a), (a, b), and (a, b, c)
as well, but *not* on (b, c, d).

An index on (a, b) works as a *PARTIAL* substitute for an index on
(a, b, c), or (a, b, c, d) (but it's slightly slower).

MySQL generally uses at most one index on the WHERE clause and one
index on each JOIN. An index on (a) and a separate index on (b)
is not a complete substitute for an index on (b, a) or (a, b) (but
it is a partial substitute, using the index on (a) as a partial
substitute for one on (a,b) ).

MySQL can also use an index for the fields in an ORDER BY clause.

An index on a field that has few different values does little to
improve the search. For example, for a customer database for a
lawn-mowing service or other *local* small business in Fort Worth,
Texas, an index on (planet, country, state, city) as opposed to
just (city) may be rather pointless since 99.9% of them are for
('Earth', 'United States', 'Texas', city), (hint: Fort Worth is
not near the Texas state boundary) and, surprise, no entries for
Australia, France, Oklahoma, or Mars. For a mail-order business,
this can be very different.


Example: I have a database of video recordings, which is accessed
by private web pages. The 'recording' table has one entry for each
copy of a particular recording. There are other tables I'm not
going to mention here.

1. The table has an `id` auto_increment field. This is the primary key.

2. A given `track` on a given `disk` may contain no more than one
episode recording, so this suggests a unique index on (disk, track)
or (track, disk) to prevent duplicates.

The web site has several pages that use the following WHERE clauses:

3. WHERE series = 'value'
(all the recordings for a particular series)
suggests an index on (series).
4. WHERE series = 'value1' and prod = 'value2'
(all the recordings of a particular episode (series
and production number) - there can be more than one
such recording)
suggests an index on (series, prod) or (prod, series)
5. WHERE disk = 'value'
(all the recordings on a particular disk)
suggests an index on (disk)

6. Updates to particular records, for example, when the recording is
moved from a DVR hard disk to a DVD, are done by:
WHERE id = 'value'.

There are other fields, such as recording length, recording date,
quality, and title, which do not need an index. You might think
that this is not normalized, as title might seem to depend only on
(series, prod), but it turns out that the same episode can have
different titles on different recordings.

Ok, we've got suggestions for indexes:

1. (id) unique
2. (disk, track) or (track, disk) unique
3. (series)
4. (series, prod) or (prod, series)
5. (disk)
6. (id)

I need a primary key of (id) to satisfy #1, which also covers #6.
From #2, I choose a unique index on (disk, track) which also satisfies #5.
If I had chosen (track, disk), that wouldn't satisfy #5.
From #4, I choose a non-unique index on (series, prod) which also
satisfies #3.

That's 3 indexes.

Reply With Quote
  #9  
Old   
SpreadTooThin
 
Posts: n/a

Default Re: Index after the fact? - 12-01-2011 , 01:15 PM



I just had this sinking feeling...
While the ALTER TABLE command is in progress, should clients be stopped from doing inserts into that table?

because.. they are!!!!

Reply With Quote
  #10  
Old   
The Natural Philosopher
 
Posts: n/a

Default Re: Index after the fact? - 12-01-2011 , 02:22 PM



SpreadTooThin wrote:
Quote:
I just had this sinking feeling...
While the ALTER TABLE command is in progress, should clients be stopped from doing inserts into that table?

because.. they are!!!!

I've never bothered ;-)

My guess is it locks the table against inserts while indexing...oddly
enough MySQL seems to have been written by people with brains larger
than peas. And it would seem the logical thing to do.

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.