dbTalk Databases Forums  

Query directly from secondary index

comp.databases.paradox comp.databases.paradox


Discuss Query directly from secondary index in the comp.databases.paradox forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Jim Moseley
 
Posts: n/a

Default Query directly from secondary index - 10-18-2007 , 01:03 AM







Greetings all.

I think I've stumbled onto a way to speed up queries, but am looking for
a second opinion.

I have a table with several secondary indexes, one of which contains the
'Name' & 'Key' fields. I want to perform a generic match on the 'Name' field,
to return all rows that have Name = "..Jim..".

If I perform my query on the DB file, it will open & search the entire table,
ignoring the 2nd index because of the wildcards.

If I perform my query directly against the appropriate XG* file, it will
perform the query on a 'mini-table' of just the Name & Key columns, which
should be much smaller than the entire DB table.

Does this make sense?

For example:

Query
Customer.XG1 | Key | Name |
Quote:
Check | CheckPlus ..Jim.. |
endQuery

Note that I tried to open a tcursor against the 'Customer.XG1' file & it
wouldn't let me. The query engine must work differently.

Extra credit: by doing a enumIndexStruct on the table, does the iIndexID
column map directly to the 'XG*' name, as 'X' + string(hex(iIndexID + 1))?
I'm guessing this is the case, because the iIndexID values start with 256
and go up by 1 for each additional index.

Thanks,
Jim Moseley


Reply With Quote
  #2  
Old   
Anders Jonsson
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 01:40 AM






Quote:
I think I've stumbled onto a way to speed up queries, but am looking for
a second opinion.

I have a table with several secondary indexes, one of which contains the
'Name' & 'Key' fields. I want to perform a generic match on the 'Name'
field,
to return all rows that have Name = "..Jim..".
I'm not sure but will a search like that ever be able to take advantage of
an index? To be able to find all match any routine must go through all
records no matter how they are organized. I'm just guessing, I could be very
wrong here.

If your search had been for "Jim.." it would have been a different story and
then I would have used a tCursor opened on the secondary index and then used
setrange.

Anders





Reply With Quote
  #3  
Old   
Steven Green
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 04:47 AM



if you rename the x* fle, to something.db, it IS a table you can work with..
but then you'd need to do a 2nd query to grab the remaining info you'll
need, from the main table.. and, FWIW, you're still doing a dot-dot query,
that has to go thru all records..

--
Steven Green - Myrtle Beach, South Carolina USA

Diamond Software Group
http://www.diamondsg.com/main.htm
Paradox Support & Sales

Diamond Sports Gems
http://www.diamondsg.com/gemsmain.htm
Sports Memorabilia and Trading Cards

"Jim Moseley" <jmose (AT) mapson (DOT) attglobal.net> wrote

Quote:
Greetings all.

I think I've stumbled onto a way to speed up queries, but am looking for
a second opinion.

I have a table with several secondary indexes, one of which contains the
'Name' & 'Key' fields. I want to perform a generic match on the 'Name'
field,
to return all rows that have Name = "..Jim..".

If I perform my query on the DB file, it will open & search the entire
table,
ignoring the 2nd index because of the wildcards.

If I perform my query directly against the appropriate XG* file, it will
perform the query on a 'mini-table' of just the Name & Key columns, which
should be much smaller than the entire DB table.

Does this make sense?

For example:

Query
Customer.XG1 | Key | Name |
| Check | CheckPlus ..Jim.. |
endQuery

Note that I tried to open a tcursor against the 'Customer.XG1' file & it
wouldn't let me. The query engine must work differently.

Extra credit: by doing a enumIndexStruct on the table, does the iIndexID
column map directly to the 'XG*' name, as 'X' + string(hex(iIndexID + 1))?
I'm guessing this is the case, because the iIndexID values start with 256
and go up by 1 for each additional index.

Thanks,
Jim Moseley



Reply With Quote
  #4  
Old   
Jim Moseley
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 07:05 AM




Steven,

Quote:
but then you'd need to do a 2nd query to grab the remaining
info you'll need, from the main table..
My app handles this already. In this routine, all I need is a table containing
the 'Key' values. If there is more than one row, I show the user a screen
asking them to choose. Its DM has the Answer.db linked 1-1 to Customer.db
(to show other descriptive fields).

Quote:
and, FWIW, you're still doing a dot-dot query,
that has to go thru all records..
Exactly my point. By directly querying the Customer.XG1 file, I avoid querying
Customer.DB. Since the XG1 only has 2 columns, it should be much faster
than going after my 'wide' tables.

Thanks,
Jim Moseley


Reply With Quote
  #5  
Old   
Jim Moseley
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 07:16 AM




Anders

Quote:
I'm not sure but will a search like that ever be able to take advantage
of
an index?
But, it will only go through all the rows of the index, not the base table.
At one client, Customer.db is 650kb but Customer.XG1 is only 66kb. 10 times
smaller should be 10 times faster, or more since I'm not getting network
retries.


Quote:
If your search had been for "Jim.." it would have been a different story

Right, but I didn't want to force my users to enter data a specific way -
'Jim Moseley' or 'Moseley, Jim' or whatever. Also, it helps them if they
can't remember how to spell, ie. 'McGuire' 'Maguire' 'MacGuire' can all be
found with '..guire..'.

Thanks,
Jim Moseley


Reply With Quote
  #6  
Old   
Steven Green
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 08:15 AM



Jim.. basically, you're just trying to mimic default query/search behavior
that's already there.. and, if the tables are that small, the network has to
be *really* slow for all this to make a noticable difference.. however, your
theory is sound, so there's nothing to lose by testing it..

--
Steven Green - Myrtle Beach, South Carolina USA

Diamond Software Group
http://www.diamondsg.com/main.htm
Paradox Support & Sales

Diamond Sports Gems
http://www.diamondsg.com/gemsmain.htm
Sports Memorabilia and Trading Cards

"Jim Moseley" <jmose (AT) mapson (DOT) attglobal.net> wrote

Quote:
Steven,

but then you'd need to do a 2nd query to grab the remaining
info you'll need, from the main table..

My app handles this already. In this routine, all I need is a table
containing
the 'Key' values. If there is more than one row, I show the user a screen
asking them to choose. Its DM has the Answer.db linked 1-1 to Customer.db
(to show other descriptive fields).

and, FWIW, you're still doing a dot-dot query,
that has to go thru all records..

Exactly my point. By directly querying the Customer.XG1 file, I avoid
querying
Customer.DB. Since the XG1 only has 2 columns, it should be much faster
than going after my 'wide' tables.

Thanks,
Jim Moseley



Reply With Quote
  #7  
Old   
Anders Jonsson
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 10:20 AM




Quote:
But, it will only go through all the rows of the index, not the base
table.
At one client, Customer.db is 650kb but Customer.XG1 is only 66kb. 10
times
smaller should be 10 times faster, or more since I'm not getting network
retries.
I'm certainly no expert on this but wouldn't the file size of .db file have
a rather minor impact?

Your only search ONE column and then it must be number of records that is
most important. I assume you have exactly the same number of records in both
..db and .xg1?

But maybe the entire table needs to go over the network even if you search
only one column? I have no idea, but even then 650 kb is not much on a
modern network.


Quote:
Right, but I didn't want to force my users to enter data a specific way -
'Jim Moseley' or 'Moseley, Jim' or whatever. Also, it helps them if they
can't remember how to spell, ie. 'McGuire' 'Maguire' 'MacGuire' can all be
found with '..guire..'.
I understand, but in there lies your problem. If you want a flexible search
you will have to offer some performance.

But how slow are your searches?

If you had split your name in to fields firstname and lastname you could
have given the user options, either "fast search" or a slower but more
flexibel one.

Sometimes when I have a situation where performance is really crucial, I
would try to "cache" the data to the users PRIV. All operations against
tables in PRIV are always faster than in any other folder, even compared to
other folders on C. I have no idea if this could be used in you case but it
might be worth a try?

Finally I would also compare a tCursor scan to a query, sometimes I have the
impression that cursors could be quicker even when no index is used.


Anders






Reply With Quote
  #8  
Old   
Tony McGuire
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 10:53 AM





Quote:
But, it will only go through all the rows of the index, not the
base table.

I'm certainly no expert on this but wouldn't the file size of .db
file have a rather minor impact?
Imagine a 100MB file on network server. With a 500k index file.

A query draws the entire table local, and THEN processes the query.

Which file would you prefer to bring local for the query to act on?

--
------------------------------
Tony McGuire




Reply With Quote
  #9  
Old   
Anders Jonsson
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 12:05 PM



Quote:
Imagine a 100MB file on network server. With a 500k index file.
Which is ofcourse very different from a 650kb with 50 k index file. The real
difference on small files is almost impossible to see.

Quote:
A query draws the entire table local, and THEN processes the query

Which file would you prefer to bring local for the query to act on?

Thanks for the clarification, I wasn't sure that it brought ALL columns
over, even the one that is not included in the query. How about a tcursor
then? Will a scan of all records eventually bring the enterie table over the
network, even the columns that is not refered to by the tcursor?


Anders




Reply With Quote
  #10  
Old   
Jim Moseley
 
Posts: n/a

Default Re: Query directly from secondary index - 10-18-2007 , 12:53 PM




Anders,

Quote:
Imagine a 100MB file on network server. With a 500k index file.
This was exactly my problem. One of my clients had a 20MB Customer table,
whereas all others have under 1MB. Their network is dying, and I'm trying
to speed it up. Sorry I didn't post this number earlier, but I didn't have
an exact size on the XG1 file.


Quote:
How about a tcursor then? Will a scan of all records
eventually bring the enterie table over the
network, even the columns that is not refered to by the tcursor?
Sadly, you can't open a tcursor on the XG1 directly - it gives a message
'Table is not indexed' (ironically).

If you open a tcursor on the base table, it will 'pre-fetch' 8 pages based
on the block size. Then, as you jump around the table, it will grab whatever
pages are needed. So, if you scan the entire table, you'll need all 20MB
in my above example.

One other benefit to the XG1: if you open a tcursor on a table, it will
open all table lookups also. I'm assuming it doesn't want to slow down tc.edit()
(or the f9 key), so it always assumes you'll edit. I'm not sure, but these
might also get 8 pages 'pre-fetched'.

I'm getting somewhat proficient with the free WireShark network monitor.
http://www.wireshark.org I you have speed issues, this is the best tool
I've found. It replaced Ethereal a while back.

Thanks,
Jim Moseley


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.