![]() | |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi Recent discussions about indexing set me to thinking that I should once again look at D3 Indices. So I did a simple test. The Invoices file contains multi-valued attributes for the invoice lines. One line is the key to the product file. We frequently wish to sort invoice sales by Product Group and Product name, both of which are in the Product file. I prefer F correlatives but popped in an "A" type for the exercise. The result is fine for LIST and SORT BY-EXP. However any attempt to create an index results in a crash. I originally posed this question when Pick first did b-trees and got a silly answer that did not address the question. |
|
Now that people have had years of practise can someone explain exactly what the problem is here ( no theories please just an answer to the exact question) In my opinion any b-tree that cannot deal with builds like this is simply a waste of space. |
|
:CT DICT INVOICES GRPDESC GRPDESC 001 S 002 0 003 Product Description 004 005 006 007 008 A13(TPRODUCTS;X;3;3]MR%3):13(TPRODUCTS;X;2;2) 009 L 010 32 LIST INVOICES GRPDESC INVOICES Product Description............. 1*514336 050Longhaul Premium 052Brake & Clutch Fluid 1*514337 052Revtex Super 2T snip SORT INVOICES BY-EXP GRPDESC GRPDESC INVOICES Product Description............. 1*516787 002Unleaded Unmarked 1*516788 009Gas 1*514336 050Longhaul Premium 1*514338 050Longhaul Premium snip :CREATE-INDEX INVOICES GRPDESC Creating index on A13(TPRODUCTS;X;3;3]MR%3):13(TPRODUCTS;X;2;2) 0002457 [3] The verb 'MR%3):13(TPRODUCTS;X;2;2)' is not defined. : There are 2457 invoices in the test file with in excess of 4,500 lines so it appears to have indexed on attribute 13 in full which would be pretty silly as it is multi-valued. Peter McMurray |
#3
| |||
| |||
|
|
"Peter McMurray" <excalibur21 (AT) bigpond (DOT) com> wrote in message news:bMWzg.5086$rP1.2589 (AT) news-server (DOT) bigpond.net.au... Hi Recent discussions about indexing set me to thinking that I should once again look at D3 Indices. So I did a simple test. The Invoices file contains multi-valued attributes for the invoice lines. One line is the key to the product file. We frequently wish to sort invoice sales by Product Group and Product name, both of which are in the Product file. I prefer F correlatives but popped in an "A" type for the exercise. The result is fine for LIST and SORT BY-EXP. However any attempt to create an index results in a crash. I originally posed this question when Pick first did b-trees and got a silly answer that did not address the question. I don't know when that was, but it wasn't when I was running that part of the engineering department. Now that people have had years of practise can someone explain exactly what the problem is here ( no theories please just an answer to the exact question) In my opinion any b-tree that cannot deal with builds like this is simply a waste of space. You're a smart guy. You could have figured this one out: :CT DICT INVOICES GRPDESC GRPDESC 001 S 002 0 003 Product Description 004 005 006 007 008 A13(TPRODUCTS;X;3;3]MR%3):13(TPRODUCTS;X;2;2) 009 L 010 32 LIST INVOICES GRPDESC INVOICES Product Description............. 1*514336 050Longhaul Premium 052Brake & Clutch Fluid 1*514337 052Revtex Super 2T snip SORT INVOICES BY-EXP GRPDESC GRPDESC INVOICES Product Description............. 1*516787 002Unleaded Unmarked 1*516788 009Gas 1*514336 050Longhaul Premium 1*514338 050Longhaul Premium snip :CREATE-INDEX INVOICES GRPDESC Creating index on A13(TPRODUCTS;X;3;3]MR%3):13(TPRODUCTS;X;2;2) 0002457 [3] The verb 'MR%3):13(TPRODUCTS;X;2;2)' is not defined. : There are 2457 invoices in the test file with in excess of 4,500 lines so it appears to have indexed on attribute 13 in full which would be pretty silly as it is multi-valued. Peter McMurray 1) It's not the indexing that is failing, it's the a-correlative in the index. Try this: A13(TPRODUCTS;X;3;2)(MR%3):13(TPRODUCTS;X;2;2) 2) Regardless how many indices there are, there are still only 2457 records. Most problems with indexing in D3 comes from not understanding what indexes are and what they are inteded to do. First and foremost, they are a list of the item ids of the file in some particular sequence. They look a little like this: indexKey <am> id <vm> id2 <vm>...<idN> <sm> indexKey2 <etc If the index key is really long, or if there are thousands of Item IDs for a single key (as with null keys) there can be problems when the leaves in the b-tree split into branches. Index correlatives that call subroutines or do multiple translates can cause problems if any of the pieces aren't there when you need them. Regardless of how many branches and leaves there are in the index, it is in fact one gigantic linked list. That being said, the English pre-compiler looks at the WITH selection list and if the first ADI has an index and the selection mode is EQUALity, AQL is smart enough to use the index to speed up selection, particularly in the sort pass, as the index is already in sequence and doesn't need to be sorted. A typical file select process starts with a frame of overflow, marches thru the file sequentially group by group, selecting items that match and placing ids into a dynamic array. Once the selection has been done, if required, the sort pass takes place by using that list to re-read the items, gather the sort criteria and then do a merging multi-list sort to arrive at a sorted id list. Using indexes, AQL uses a system level function to find the first index that matches the criteria (eg. SELECT file WITH A1 = "CA") and then cruises the links until the condition is no longer true, placing the item id's into the list and using high-speed search technique simalar to how items are read and GFE's detected. When the list is built, the sort phase is skipped and we're ready to go. Typically the number of frames in the index is significantly less than actual file size, so even selecting the entire file would be faster just from simply working less. Since the index is maintained automatically at the system level, for large files, there's nothing better. Sometimes you can create "pseudo-"indexes to facilitate reporting. For example and index like An(state)(ml#3):n(district) would pre-sort the sales file by state by district even before reporting is required. Mark Brown |
#4
| ||||
| ||||
|
|
Hi Mark Thanks for the rapid response ( DO you work all night?) Unfortunately the suggested correlative does not work in SORT BY-EXP as my original did. Nor does the new method bring back a sorted list with say SORT INVOICES BY-EXP GRPDESC "052]" it just comes back with no items present. In trying to look at the key I could be wrong with my code but using the 'N' option to race through all the keys it appears to have just made a list of all the invoice lines unsorted. |
|
If I take into account the enormously messy way I have to actually reproduce the English correlative in Basic and I would typically have upto 9 or 10 attributes in a btree key. I am wondering if it is worthwhile on anything but the most simplistic sorts. |
|
By the way I meant no disrespect to you. In fact I have fond memories of the best presentation ever given over a couple of days was yours on Microsoft interfacing. I don't remember who it was that answered my original query but they certainly avoided complex keys that I specifically asked about. I certainly have a better understanding of what Pick is doing now from your post and it is not what I was expecting. Perhaps you could volunteer a complex English correlative that I could work through |
|
As for Btrees I have been using them for nigh on 25 years as Micromax had a brilliant implementation and I sold the first of the Micromax 3000's of the convention floor in 1982. The B-trees worked so well the client ran around telling everybody how his new system was 10 times faster than his direct IBM mainframe link to BP with the same data (we actually picked up 25,000 customers from the mainframe tapes so the data was very similar). Regards Peter McMurray I've been working on B-trees sinced they were still called balanced trees, |
#5
| |||
| |||
|
#6
| |||
| |||
|
|
Two points: 1. Not sure, but perhaps the index work better if attribute 2 of the dictionary item was 13? i.e.: GRPDESC 001 S 002 13 <----------------------------- this is a mv field, right? 003 Product Description 004 005 006 007 008 A13(TPRODUCTS;X;3;3]MR%3):13(TPRODUCTS;X;2;2) 009 L 010 32 Also, I have had good results using called basic correlatives for complicated indexes. Not that this is very complex, but you might try: 008 a13(call products.grpdesc.sub) Also, I have never tried sort by-exp with indexes, but you could always first select using the indexed field, then sort by-exp; something like: select invoices with grpdesc eq "123myproduct" sort invoices by-exp grpdesc eq "123myproduct" 2. Aren't you breaking the cardinal rule of indexes: creating an index on a field in another file? What happens if the either attribute <2> or 3> in the products file is changed? Unless you have a fancy callx subroutine in the products D-pointer that goes out and deletes and then re-creates the affected invoices records, you are hosed. /Scott Ballinger Pareto Corporation Edmonds WA USA 206 713 6006 |
#7
| |||
| |||
|
|
"Peter McMurray" <excalibur21 (AT) bigpond (DOT) com> wrote in message news:4sZzg.5215$rP1.4602 (AT) news-server (DOT) bigpond.net.au... Hi Mark Thanks for the rapid response ( DO you work all night?) Unfortunately the suggested correlative does not work in SORT BY-EXP as my original did. Nor does the new method bring back a sorted list with say SORT INVOICES BY-EXP GRPDESC "052]" it just comes back with no items present. In trying to look at the key I could be wrong with my code but using the 'N' option to race through all the keys it appears to have just made a list of all the invoice lines unsorted. Here's my test file. Only three records and not much data :list test pd Page 1 test 04:20:56 02 Aug 2006 test...... Prod Desc..................... 2 003product 3 3 001product 1 003product 3 1 002product 2 001product 1 [405] 3 items listed out of 3 items. :sort test pd by pd Page 1 test 04:21:18 02 Aug 2006 test...... Prod Desc..................... 3 001product 1 003product 3 1 002product 2 001product 1 2 003product 3 [4051] 3 items listed. :sort test pd by-exp pd Page 1 test 04:21:22 02 Aug 2006 test...... Prod Desc..................... 1 001product 1 3 001product 1 1 002product 2 2 003product 3 3 003product 3 [4051] 5 items listed. My a-corellative was a1(TPROD;X;3;3)(MR%3):1(TPROD;X;2;2) I'm running 7.4.4.400 If I take into account the enormously messy way I have to actually reproduce the English correlative in Basic and I would typically have upto 9 or 10 attributes in a btree key. I am wondering if it is worthwhile on anything but the most simplistic sorts. I'd like to see something like that. Usually I find we tend to over-complicate things. By the way I meant no disrespect to you. In fact I have fond memories of the best presentation ever given over a couple of days was yours on Microsoft interfacing. I don't remember who it was that answered my original query but they certainly avoided complex keys that I specifically asked about. I certainly have a better understanding of what Pick is doing now from your post and it is not what I was expecting. Perhaps you could volunteer a complex English correlative that I could work through Don't worry (be happy). I get plenty of disrespect, so I never feel short changed. How about: aa (correl: tprod;x;3;3) and bb (tprod;x;2;2) and then you can an(aa)(mr%3):n(bb) As for Btrees I have been using them for nigh on 25 years as Micromax had a brilliant implementation and I sold the first of the Micromax 3000's of the convention floor in 1982. The B-trees worked so well the client ran around telling everybody how his new system was 10 times faster than his direct IBM mainframe link to BP with the same data (we actually picked up 25,000 customers from the mainframe tapes so the data was very similar). Regards Peter McMurray I've been working on B-trees sinced they were still called balanced trees, when there were a fixed number of indexes per node and that was it. You designed your tree with 1 block at the top, a second level with as many blocks as there were keys in the first block, etc until you ended up with the last layer which held one key:id pair for each record in the file. Theoretically, it never took more than levels + 1 read to get to any record. BTW, I use indexed a LOT in Visual Basic and DOT Net. It's much faster to read an index, put the display key into a pull-down or listbox and store the item id than to sselect the file and read every record to get the same information. Mark Brown |
#8
| |||
| |||
|
|
Here are some examples. The first two are on the stock movement file and all the information is contained within the stock movement record. Enquiries cannot wait for a full sort of the file. An algebraic dictionary is going to be a mess. More importantly the second will only exist for Stock Purchases and Returns. STKMVEBT1*1 001 Stock Movements By Product & Pack - Btree 002 003 STKMVEBT1F 004 KEY 005 CO "R%2" 006 PRODUCT "R%6" 007 PACK "R%8" 008 DEPOT "L#4" 009 LOCCODE "L#4" 010 PERNO "R%7" 011 MVETYPE "R%2" 012 MVEDAT "R%5" 013 SOURCECO "R%2" 014 BATCHNO "R%6" 015 BLINE "R%3" 016 BATCHTYPE "L#1" : STKMVEBT2*1 001 Stock Movements From Suppliers - Btree 002 003 STKMVEBT2F 004 KEY 005 MVESOURCE "L#8" 006 MVEDOC "R%8" 007 SOURCECO "R%2" 008 DEPOT "L#4" 009 BATCHNO "R%6" 010 BLINE "R%2" : This combines Name information from the Debtors master file, which controls credit, with Address information from the Address file which controls pricing and delivery information. The Aboriginal centre may have deliveries to 40 or 50 settlements, the Seventh Day Adventists may have deliveries to a dozen of their pastors, ColesMyer could have 700 to 800 delivery points. NAMEBT*1 001 The Name Enquiry Btree File 002 003 NAMEBTF 004 KEY 005 SORTNAME "L#10" 006 CO "R%2" 007 NUMBER "R%6" 008 ADDNO "R%3" This allows us to check delivery requests by street. Dodgy debtors often come in with different names but the same delivery address - it is not much good having your heating oil delivered to somone else's tank. STREETBT*1 001 The Street Enquiry Btree File 002 003 STREETBTF 004 KEY 005 SORTSTREET "L#10" 006 ADDNUM "R%5" 007 CO "R%2" 008 NUMBER "R%6" 009 ADDNO "R%3" : "Mark Brown" <mbrown (AT) drexelmgt (DOT) com> wrote in message news:v60Ag.8218$Vq1.4667 (AT) tornado (DOT) socal.rr.com... "Peter McMurray" <excalibur21 (AT) bigpond (DOT) com> wrote in message news:4sZzg.5215$rP1.4602 (AT) news-server (DOT) bigpond.net.au... Hi Mark Thanks for the rapid response ( DO you work all night?) Unfortunately the suggested correlative does not work in SORT BY-EXP as my original did. Nor does the new method bring back a sorted list with say SORT INVOICES BY-EXP GRPDESC "052]" it just comes back with no items present. In trying to look at the key I could be wrong with my code but using the 'N' option to race through all the keys it appears to have just made a list of all the invoice lines unsorted. Here's my test file. Only three records and not much data :list test pd Page 1 test 04:20:56 02 Aug 2006 test...... Prod Desc..................... 2 003product 3 3 001product 1 003product 3 1 002product 2 001product 1 [405] 3 items listed out of 3 items. :sort test pd by pd Page 1 test 04:21:18 02 Aug 2006 test...... Prod Desc..................... 3 001product 1 003product 3 1 002product 2 001product 1 2 003product 3 [4051] 3 items listed. :sort test pd by-exp pd Page 1 test 04:21:22 02 Aug 2006 test...... Prod Desc..................... 1 001product 1 3 001product 1 1 002product 2 2 003product 3 3 003product 3 [4051] 5 items listed. My a-corellative was a1(TPROD;X;3;3)(MR%3):1(TPROD;X;2;2) I'm running 7.4.4.400 If I take into account the enormously messy way I have to actually reproduce the English correlative in Basic and I would typically have upto 9 or 10 attributes in a btree key. I am wondering if it is worthwhile on anything but the most simplistic sorts. I'd like to see something like that. Usually I find we tend to over-complicate things. By the way I meant no disrespect to you. In fact I have fond memories of the best presentation ever given over a couple of days was yours on Microsoft interfacing. I don't remember who it was that answered my original query but they certainly avoided complex keys that I specifically asked about. I certainly have a better understanding of what Pick is doing now from your post and it is not what I was expecting. Perhaps you could volunteer a complex English correlative that I could work through Don't worry (be happy). I get plenty of disrespect, so I never feel short changed. How about: aa (correl: tprod;x;3;3) and bb (tprod;x;2;2) and then you can an(aa)(mr%3):n(bb) As for Btrees I have been using them for nigh on 25 years as Micromax had a brilliant implementation and I sold the first of the Micromax 3000's of the convention floor in 1982. The B-trees worked so well the client ran around telling everybody how his new system was 10 times faster than his direct IBM mainframe link to BP with the same data (we actually picked up 25,000 customers from the mainframe tapes so the data was very similar). Regards Peter McMurray I've been working on B-trees sinced they were still called balanced trees, when there were a fixed number of indexes per node and that was it. You designed your tree with 1 block at the top, a second level with as many blocks as there were keys in the first block, etc until you ended up with the last layer which held one key:id pair for each record in the file. Theoretically, it never took more than levels + 1 read to get to any record. BTW, I use indexed a LOT in Visual Basic and DOT Net. It's much faster to read an index, put the display key into a pull-down or listbox and store the item id than to sselect the file and read every record to get the same information. Mark Brown |
#9
| |||
| |||
|
|
I appologize. When I saw your original post, I immediately jumped into my "test" account, created a file, dict items and data and tried everything, and, of course, it works perfectly. However, when I saw another of your posts where you say it does, and as I don't think you'd lie about it, I went back and to my shagrin, I find that my test account is a VME account where indexing actually works. The problem is the NT FSI specific indexing. I easily reproduced your problem and you're absolutely right: that one just doesn't work. I still have a couple friends, so I'll send them a "reproducable case". Since they seem hell bent to move FSI into the *nix product line, this is something that has to be fixed. Since FSI file system is different, the index is also different. I know the people involved in creating it and I won't say a bad word about them, but this is one place where they really let us down. As to a-correlatives, I came to them late in life, too; but now I love them. Since you can basically stack stuff forever, you can make dict items like this (short examples; but you'll get the idea): a0:N(CO)(R%2) : N(PRODUCT)(R%6) : N(PACK)(R%8) At least, you only need to do it once and you could probably actually create a little Pick Basic routine to prompt for that stuff and build a new dict item for you. That's what I'd do. Your "key" data is only 50 bytes long and, with all that, I'd assume probably pretty unique. But you would probably be just as well served with some kind of a file-time trigger to update cross-reference file with this data, with the key as the "key" then a simple index on a0. So something like this would work: select stkmve.xref with a0 "something*meaningful" select stkmve.xref a1 first select gets the keys that match and the second returns all ids Mark Here are some examples. The first two are on the stock movement file and all the information is contained within the stock movement record. Enquiries cannot wait for a full sort of the file. An algebraic dictionary is going to be a mess. More importantly the second will only exist for Stock Purchases and Returns. STKMVEBT1*1 001 Stock Movements By Product & Pack - Btree 002 003 STKMVEBT1F 004 KEY 005 CO "R%2" 006 PRODUCT "R%6" 007 PACK "R%8" 008 DEPOT "L#4" 009 LOCCODE "L#4" 010 PERNO "R%7" 011 MVETYPE "R%2" 012 MVEDAT "R%5" 013 SOURCECO "R%2" 014 BATCHNO "R%6" 015 BLINE "R%3" 016 BATCHTYPE "L#1" : STKMVEBT2*1 001 Stock Movements From Suppliers - Btree 002 003 STKMVEBT2F 004 KEY 005 MVESOURCE "L#8" 006 MVEDOC "R%8" 007 SOURCECO "R%2" 008 DEPOT "L#4" 009 BATCHNO "R%6" 010 BLINE "R%2" : This combines Name information from the Debtors master file, which controls credit, with Address information from the Address file which controls pricing and delivery information. The Aboriginal centre may have deliveries to 40 or 50 settlements, the Seventh Day Adventists may have deliveries to a dozen of their pastors, ColesMyer could have 700 to 800 delivery points. NAMEBT*1 001 The Name Enquiry Btree File 002 003 NAMEBTF 004 KEY 005 SORTNAME "L#10" 006 CO "R%2" 007 NUMBER "R%6" 008 ADDNO "R%3" This allows us to check delivery requests by street. Dodgy debtors often come in with different names but the same delivery address - it is not much good having your heating oil delivered to somone else's tank. STREETBT*1 001 The Street Enquiry Btree File 002 003 STREETBTF 004 KEY 005 SORTSTREET "L#10" 006 ADDNUM "R%5" 007 CO "R%2" 008 NUMBER "R%6" 009 ADDNO "R%3" : "Mark Brown" <mbrown (AT) drexelmgt (DOT) com> wrote in message news:v60Ag.8218$Vq1.4667 (AT) tornado (DOT) socal.rr.com... "Peter McMurray" <excalibur21 (AT) bigpond (DOT) com> wrote in message news:4sZzg.5215$rP1.4602 (AT) news-server (DOT) bigpond.net.au... Hi Mark Thanks for the rapid response ( DO you work all night?) Unfortunately the suggested correlative does not work in SORT BY-EXP as my original did. Nor does the new method bring back a sorted list with say SORT INVOICES BY-EXP GRPDESC "052]" it just comes back with no items present. In trying to look at the key I could be wrong with my code but using the 'N' option to race through all the keys it appears to have just made a list of all the invoice lines unsorted. Here's my test file. Only three records and not much data :list test pd Page 1 test 04:20:56 02 Aug 2006 test...... Prod Desc..................... 2 003product 3 3 001product 1 003product 3 1 002product 2 001product 1 [405] 3 items listed out of 3 items. :sort test pd by pd Page 1 test 04:21:18 02 Aug 2006 test...... Prod Desc..................... 3 001product 1 003product 3 1 002product 2 001product 1 2 003product 3 [4051] 3 items listed. :sort test pd by-exp pd Page 1 test 04:21:22 02 Aug 2006 test...... Prod Desc..................... 1 001product 1 3 001product 1 1 002product 2 2 003product 3 3 003product 3 [4051] 5 items listed. My a-corellative was a1(TPROD;X;3;3)(MR%3):1(TPROD;X;2;2) I'm running 7.4.4.400 If I take into account the enormously messy way I have to actually reproduce the English correlative in Basic and I would typically have upto 9 or 10 attributes in a btree key. I am wondering if it is worthwhile on anything but the most simplistic sorts. I'd like to see something like that. Usually I find we tend to over-complicate things. By the way I meant no disrespect to you. In fact I have fond memories of the best presentation ever given over a couple of days was yours on Microsoft interfacing. I don't remember who it was that answered my original query but they certainly avoided complex keys that I specifically asked about. I certainly have a better understanding of what Pick is doing now from your post and it is not what I was expecting. Perhaps you could volunteer a complex English correlative that I could work through Don't worry (be happy). I get plenty of disrespect, so I never feel short changed. How about: aa (correl: tprod;x;3;3) and bb (tprod;x;2;2) and then you can an(aa)(mr%3):n(bb) As for Btrees I have been using them for nigh on 25 years as Micromax had a brilliant implementation and I sold the first of the Micromax 3000's of the convention floor in 1982. The B-trees worked so well the client ran around telling everybody how his new system was 10 times faster than his direct IBM mainframe link to BP with the same data (we actually picked up 25,000 customers from the mainframe tapes so the data was very similar). Regards Peter McMurray I've been working on B-trees sinced they were still called balanced trees, when there were a fixed number of indexes per node and that was it. You designed your tree with 1 block at the top, a second level with as many blocks as there were keys in the first block, etc until you ended up with the last layer which held one key:id pair for each record in the file. Theoretically, it never took more than levels + 1 read to get to any record. BTW, I use indexed a LOT in Visual Basic and DOT Net. It's much faster to read an index, put the display key into a pull-down or listbox and store the item id than to sselect the file and read every record to get the same information. Mark Brown |
#10
| |||
| |||
|
|
However, when I saw another of your posts where you say it does, and as I don't think you'd lie about it, I went back and to my shagrin, I find that my test account is a VME account where indexing actually works. |
|
The problem is the NT FSI specific indexing. I easily reproduced your problem and you're absolutely right: that one just doesn't work. |
|
As to a-correlatives, I came to them late in life, too; but now I love them. Since you can basically stack stuff forever, you can make dict items like this (short examples; but you'll get the idea): a0:N(CO)(R%2) : N(PRODUCT)(R%6) : N(PACK)(R%8) |
![]() |
| Thread Tools | |
| Display Modes | |
| |