dbTalk Databases Forums  

Re: Trouble with query optimizer and custom storage engine: excessive query times

mailing.database.mysql-internals mailing.database.mysql-internals


Discuss Re: Trouble with query optimizer and custom storage engine: excessive query times in the mailing.database.mysql-internals forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Sergei Golubchik
 
Posts: n/a

Default Re: Trouble with query optimizer and custom storage engine: excessive query times - 08-18-2006 , 11:53 AM






Hi!

On Aug 18, Olaf Barthel wrote:
Quote:
Through the last few months I have been working on a custom storage
engine, which is optimized for a specific application. I have come very
far indeed, but it was a very thorny road :-(
....
For purposes of comparison I filled two tables with the same data (some
130 million rows). One table uses my storage engine, the other table
uses innodb. The same query command produces very different results.
Here is the command:
....
I checked what it is that the query optimizer wants to know about my
storage engine table, and it's not a lot: number of rows (info()
method), cost of a table scan (scan_time() method), cost of an index
access (read_time() method). The records_in_range() method is not
called, and the cost returned by scan_time() is about 10^11 times larger
than the one returned by read_time().

I'm stumped. Why would the query optimizer prefer such an abismal
choice? And how would I rectify this? Note that changing the query
command to a "select straight_join" is not really an option for me.
While it cuts the number of rows involved from 95706810 to 638075,
it's far from the 693 rows the innodb query uses. Whatever problems
there is, it ought to be resolved within the storage engine domain.
You mean when you fix the join order with straight_join, optimizer still
chooses the table scan ? Try to add a FORCE INDEX to tell the optimizer
you really want to use the PRIMARY KEY - though it may not help if
optimizer thinks that it's not possible to use the PRIMARY KEY here.

And check best_access_path() function (in sql_select.cc) in a debugger,
to see why it prefers the table scan.

Regards,
Sergei

--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg (AT) mysql (DOT) com>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Kerpen, Germany
<___/ www.mysql.com

--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?uns...ie.nctu.edu.tw



Reply With Quote
  #2  
Old   
Paul McCullagh
 
Posts: n/a

Default Re: Trouble with query optimizer and custom storage engine: excessive query times - 08-18-2006 , 12:08 PM






Hi Olaf,

I have spent quite a bit of time going through with a debugger to see
how this stuff works myself.

Without implementing records_in_range(), this is the best I have come
up with for PBXT so far:

double ha_pbxt::scan_time()
{
return (double) (records + deleted) / 38.0 + 2;
}

double ha_pbxt::read_time(ha_rows rows)
{
return (double) rows / 38.0 + 1;
}

ha_rows ha_pbxt::records_in_range(uint inx, key_range *min_key,
key_range *max_key)
{
return 1;
}

Give it a try, and let me know how it goes.

Best regards,

Paul

On Aug 18, 2006, at 4:35 PM, Olaf Barthel wrote:

Quote:
Through the last few months I have been working on a custom storage
engine, which is optimized for a specific application. I have come
very far indeed, but it was a very thorny road :-(

That storage engine supports indexing, but so far I have only
activated it for the primary key. As far as my tests tell me, the
storage engine does what it is supposed to. There is, however, one
curious fact which threatens to make all the work I put into this
project obsolete. That fact is in that certain table queries can
literally take days to complete.

For purposes of comparison I filled two tables with the same data
(some 130 million rows). One table uses my storage engine, the
other table uses innodb. The same query command produces very
different results. Here is the command:

select d.symbol,d.syscode,if(min(h.datum)>date_sub(now(), interval
12 month),-1,0) as fehlende_tage , date_add(min(h.datum),interval
28 day) as min_datum, count(h.datum)-20 as anzahl_kurse from
system.dtboerse_ranglisten as d left join master.historien as h on
h.symbol=d.symbol and (h.boerse in ('xet')) and h.datum>=date_sub
(now(),interval 12 month) and weekday(h.datum)<5 where
d.rank_turnover is not null and h.datum>=date_sub(now(),interval
12 month) group by d.symbol,d.syscode,d.seg_id\G

For the innodb table the results of "explain select..." are as
follows:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: d
type: ALL
possible_keys: symsys
key: NULL
key_len: NULL
ref: NULL
rows: 693
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: h
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 26
ref: system.d.symbol,const
rows: 7
Extra: Using where; Using index
2 rows in set (0.00 sec)

For my own storage engine the following results are produced:

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: h
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 95706810
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: d
type: ref
possible_keys: symsys
key: symsys
key_len: 22
ref: master.h.symbol
rows: 2
Extra: Using where
2 rows in set (0.01 sec)

Note the number of rows involved: for my own storage engine the
query optimizer always prefers to run a complete table scan, which
with this number of rows is deadly.

I checked what it is that the query optimizer wants to know about
my storage engine table, and it's not a lot: number of rows (info()
method), cost of a table scan (scan_time() method), cost of an
index access (read_time() method). The records_in_range() method is
not called, and the cost returned by scan_time() is about 10^11
times larger than the one returned by read_time().

I'm stumped. Why would the query optimizer prefer such an abismal
choice? And how would I rectify this? Note that changing the query
command to a "select straight_join" is not really an option for me.
While it cuts the number of rows involved from 95706810 to 638075,
it's far from the 693 rows the innodb query uses. Whatever problems
there is, it ought to be resolved within the storage engine domain.

Any help would be appreciated! I'm using MySQL 5.0, because that's
the version I started implementing my custom storage engine with.

--
Home: Olaf Barthel, Gneisenaustrasse 43, D-31275 Lehrte
Net: olsen (AT) sourcery (DOT) han.de (Home), o.barthel (AT) logical-line (DOT) de (Work)

--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?
unsub=paul.mccullagh (AT) primebase (DOT) com

__ _______
\ \/ _ _/ Paul McCullagh
\ / | | SNAP Innovation GmbH
/ \ | | Altonaer Poststr 9a
/ /\ \| | 22767 Hamburg
------------ Germany
PrimeBase XT www.primebase.com/xt


--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?uns...ie.nctu.edu.tw



Reply With Quote
  #3  
Old   
Sergei Golubchik
 
Posts: n/a

Default Re: Trouble with query optimizer and custom storage engine: excessive query times - 08-20-2006 , 10:22 AM



Hi!

On Aug 20, Olaf Barthel wrote:

Quote:
Try to add a FORCE INDEX to tell the optimizer
you really want to use the PRIMARY KEY - though it may not help if
optimizer thinks that it's not possible to use the PRIMARY KEY here.

Adding 'force index (primary)' in the right place had no effect
whatsoever :-( The "explain select..." query produces exactly the same
results as if I had omitted the 'force index (primary)' text.

I have a hunch that the optimizer considers the index on the primary key
to be unusable. Here is what a "show index from master.historien" query
produces:

*************************** 1. row ***************************
Table: historien
Non_unique: 0
Key_name: PRIMARY
Seq_in_index: 1
Column_name: symbol
Collation: A
Cardinality: NULL
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
*************************** 2. row ***************************
....
What I do not understand yet is why the cardinality is NULL for the
first two columns.
NULL cardinality does not mean "unusable", in only means the cardinality
is not known. Optimizer would try to guess it, and it most probably
the guess will be completely wrong. But FORCE INDEX would override it
(it'll make table scan _very_ expensive, so any usable index, no matter
how bad, will be better than a table scan). As FORCE INDEX didn't help,
it means, indeed, that optimizer considers your index completely
unusable for this query, not just an index with bad selectivity.

Quote:
And check best_access_path() function (in sql_select.cc) in a debugger,
to see why it prefers the table scan.

I didn't dare to go the hard way yet ;-)
It may be easier than it looks
Check the big if() in that function. The if() that has a big comment
block before it, and that has different conditions in if() numbered as
(1), (2), and so on.

Quote:
Some fiddling with the
built-in DBUG package produced some interesting information, though: the
primary key index is in fact completely ignored in test_quick_select(),
as called by get_quick_record_count().
test_quick_select() and get_quick_record_count() - they're part of the
range optimizer. And range optimizer only works with constant ranges.
In your query you have

ON h.symbol=d.symbol AND ...

so a range optimizer isn't used here. That's why records_in_range()
isn't called either.

Regards,
Sergei

--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg (AT) mysql (DOT) com>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Kerpen, Germany
<___/ www.mysql.com

--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?uns...ie.nctu.edu.tw



Reply With Quote
  #4  
Old   
Sergei Golubchik
 
Posts: n/a

Default Re: Trouble with query optimizer and custom storage engine: excessive query times - 08-20-2006 , 02:36 PM



Hi!

On Aug 20, Olaf Barthel wrote:
Quote:
Sergei Golubchik schrieb:

And check best_access_path() function (in sql_select.cc) in a debugger,
to see why it prefers the table scan.
....
Check the big if() in that function. The if() that has a big comment
block before it, and that has different conditions in if() numbered
as (1), (2), and so on.

This is the section that begins with the comment text "Don't test
table scan if it can't be better", isn't it? I can't say I understand
what choices are being made, and why, looking at the code in the
debugger.
The condition (4):

!(s->table->force_index && best_key && !s->quick)

looks like the easiest to analyze. Let's assume you used FORCE INDEX.
You said that s->quick is never true. It means best_key is false.
There's only one place where best_key is set. It's inside the if()

if (tmp < best_time - records/(double) TIME_FOR_COMPARE)

You can check what is tmp, best_time, and records. Whether the execution
gets to this if() at all, or its condition is never evaluated. If it
does - where tmp, best_time, and records got their values. And so on,
going backwards in time.

Regards,
Sergei

--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg (AT) mysql (DOT) com>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Kerpen, Germany
<___/ www.mysql.com

--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?uns...ie.nctu.edu.tw



Reply With Quote
  #5  
Old   
Sergei Golubchik
 
Posts: n/a

Default Re: Trouble with query optimizer and custom storage engine: excessive query times - 08-21-2006 , 04:40 AM



Hi!

On Aug 20, Olaf Barthel wrote:
Quote:
The condition (4):

!(s->table->force_index && best_key && !s->quick)

looks like the easiest to analyze. Let's assume you used FORCE INDEX.

I've looked into it, but with no success. This particular condition
evaluates to false for the FORCE INDEX case. In the problematic case
best_key != NULL, and since s->quick == NULL the search still ends up
picking the complete table scan.
Something's wrong here. According to you

s->table->force_index is true
best_key != NULL
s->quick == NULL

so the condition becomes

!(true && true && !NULL)

which is false. So if() condition is false, and full table scan
is not considered by the optimizer.

I mean, the condition above MUST evaluate to true for full table scan to
happen. In my previous reply I assumed that it happens because
best_key==NULL.

Regards,
Sergei

--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg (AT) mysql (DOT) com>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Kerpen, Germany
<___/ www.mysql.com

--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?uns...ie.nctu.edu.tw



Reply With Quote
  #6  
Old   
Sergei Golubchik
 
Posts: n/a

Default Re: Trouble with query optimizer and custom storage engine: excessive query times - 08-21-2006 , 02:16 PM



Hi!

On Aug 21, Olaf Barthel wrote:
Quote:
I also had another look at what the "explain select straight_join"
produces on the table managed by my storage engine:
....
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: h
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 26
ref: system.d.symbol,const
rows: 509683
Extra: Using where

For the same select command on the innodb table, it looks like this:
....
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: h
type: ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 26
ref: system.d.symbol,const
rows: 240
Extra: Using where; Using index

-- 8< --

Note the "Extra:" lines, and the missing "Using index" for my custom
storage engine.
There're two differences, as you can see.

One is very different estimation for the number of rows. It comes from
the cardinality values, so you need to get cardinality values which are
close to reality.

Second is "Using index". According to the manual it mean
"
* `Using index'

The column information is retrieved from the table using only
information in the index tree without having to do an
additional seek to read the actual row. This strategy can be
used when the query uses only columns that are part of a
single index.
"

MySQL can only use this mode if index_flags() for an index contain
HA_KEYREAD_ONLY flag. Of course, ::extra() method of your handler must
support HA_EXTRA_KEYREAD to make use of this optimization.

Regards,
Sergei

--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg (AT) mysql (DOT) com>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Kerpen, Germany
<___/ www.mysql.com

--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?uns...ie.nctu.edu.tw



Reply With Quote
  #7  
Old   
Sergei Golubchik
 
Posts: n/a

Default Re: Trouble with query optimizer and custom storage engine: excessive query times - 08-21-2006 , 02:41 PM



Hi!

On Aug 21, Olaf Barthel wrote:
Quote:
Sergei Golubchik schrieb:

[..]
There're two differences, as you can see.

One is very different estimation for the number of rows. It comes from
the cardinality values, so you need to get cardinality values which are
close to reality.

Hm... the cardinality values go into the table->key_info[].rec_per_key[]
fields when the info() method is called with the HA_STATUS_CONST flag,
which is what happens for the heap storage engine. How close do I have
to approximate what's in the index? A value of 0 probably won't do, but
will any non-zero value do?
A value of 0 means "no information", "unknown".
The more precise the number is, the better.
Ideally it should be, assuming a key (a,b,c)

rec_per_key[0] = SELECT COUNT(*)/COUNT(DISTINCT a) FROM t;
rec_per_key[1] = SELECT COUNT(*)/COUNT(DISTINCT a,b) FROM t;
rec_per_key[2] = SELECT COUNT(*)/COUNT(DISTINCT a,b,c) FROM t;

Quote:
Second is "Using index". According to the manual it mean
"
* `Using index'

The column information is retrieved from the table using only
information in the index tree without having to do an
additional seek to read the actual row. This strategy can be
used when the query uses only columns that are part of a
single index.
"

MySQL can only use this mode if index_flags() for an index contain
HA_KEYREAD_ONLY flag. Of course, ::extra() method of your handler must
support HA_EXTRA_KEYREAD to make use of this optimization.

Good, I can arrange that (tomorrow). What effect will this have, in
general? If I understand correctly what the other storage engines do,
they will limit the data retrieved from the rows to the portions
covering the primary key. Which is an optimization, since it won't
call in more data than necessary. Is that correct?
It's an optimization for MyISAM, because it stores indexes and data in
different files. So if HA_EXTRA_KEYREAD is in use, MyISAM can only read
index file, without reading data file at all.
For InnoDB it makes little sense, because it keeps everything in one
file anyway. InnoDB uses HA_EXTRA_KEYREAD to avoid reading blobs, as it
stores blobs on separate data pages.

Depending on the design of your storage engine, HA_EXTRA_KEYREAD may be
a win, or a completely useless hint. It is not automatically good, and
you don't have to struggle for "Using index" in the EXPLAIN output,
unless you're sure that you can benefit from this "keyread" mode.

Regards,
Sergei

--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg (AT) mysql (DOT) com>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Kerpen, Germany
<___/ www.mysql.com

--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?uns...ie.nctu.edu.tw



Reply With Quote
  #8  
Old   
Sergei Golubchik
 
Posts: n/a

Default Re: Trouble with query optimizer and custom storage engine: excessive query times - 08-22-2006 , 06:05 AM



Hi!

On Aug 22, Olaf Barthel wrote:
Quote:
Sergei Golubchik schrieb:

MySQL can only use this mode if index_flags() for an index contain
HA_KEYREAD_ONLY flag. Of course, ::extra() method of your handler
must support HA_EXTRA_KEYREAD to make use of this optimization.

It's an optimization for MyISAM, because it stores indexes and data
in different files. So if HA_EXTRA_KEYREAD is in use, MyISAM can only
read index file, without reading data file at all. For InnoDB it
makes little sense, because it keeps everything in one file anyway.
InnoDB uses HA_EXTRA_KEYREAD to avoid reading blobs, as it stores
blobs on separate data pages.

Depending on the design of your storage engine, HA_EXTRA_KEYREAD may
be a win, or a completely useless hint. It is not automatically good,
and you don't have to struggle for "Using index" in the EXPLAIN
output, unless you're sure that you can benefit from this "keyread"
mode.

Adding HA_KEYREAD_ONLY to the index_flags() return value made all the
difference. I can't tell you how surprised I am...

It's the more surprising to me because, if what all the other storage
engines use this flag for, it ought to have no effect on the query
optimizer's treatment of the tables. Yet it does.
It should - because it changes the cost of a row read.
It may be cheaper to scan the whole table sequentially, instead of
reading a part of if via index, if the selectivity of the index is low.
On the other hand, if "keyread" mode can be used, then using index
may be cheaper than a table scan. Optimizer takes this into account.

Quote:
But as I had to find out again and again, there is more to
implementing storage engines than there is in the documentation. For
example, just today I discovered that the documentation disagrees on
the "mode" parameter passed to the open() method. Is there anything I
can do to help improving the documentation?
Sure!
Please submit a bugreport with your findings at bugs.mysql.com.
Just put the bug in the "Documentation" category.
(see, for example, http://bugs.mysql.com/bug.php?id=19814)
Thank you for the help!

Regards,
Sergei

--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg (AT) mysql (DOT) com>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Kerpen, Germany
<___/ www.mysql.com

--
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe: http://lists.mysql.com/internals?uns...ie.nctu.edu.tw



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.