dbTalk Databases Forums  

Optimizing extraction of elements from large dyn arrays

comp.databases.pick

#1
iJah
Optimizing extraction of elements from large dyn arrays - 07-29-2005, 01:41 PM

I'm looking for some way to improve performance of data extraction
from large dynamic arrays - specifically in Advanced Pick.

The mechanics of what I'm working with are already deeply ingrained in
the application software and I'm not going to re-invent the wheel, but
the sitch is that reports get written to a single record in one
big/fat attribute mark delimited array, then each line (attribute) has
to be extracted and parsed and handled for a variety of reasons.

This is being done all over with the usual

N=DCOUNT(REC,CHAR(254))
FOR I=1 TO N
   LINE = REC<I>
NEXT I

Obviously the deeper you get into the array, the slower the crawl.

In Universe I'd just use the 'REMOVE' statement and it would fly.

I've tried Advanced Pick's flavor of remove - I think the syntax is
like 'remove element at position from array setting delimiter.code' -
but I just don't seem to get much, if any, performance boost using that
method.

Any clues for how to turbocharge this kind of extraction?
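
To make the slowdown concrete, here is a rough model in Python rather than Pick BASIC. It assumes the runtime locates attribute I by scanning from the start of the string on every REC<I>, with no remembered position. That is an assumption about the implementation, not something confirmed for Advanced Pick. The names `extract_attribute` and `count_scans` are made up for the illustration.

```python
AM = chr(254)  # attribute mark


def extract_attribute(rec, i):
    """Model of REC<I>: walk from the start of the string to attribute i (1-based).

    Returns (value, delimiters_skipped). Assumes no cached position.
    """
    start = 0
    skipped = 0
    for _ in range(i - 1):
        pos = rec.find(AM, start)
        if pos == -1:
            return "", skipped  # fewer than i attributes
        skipped += 1
        start = pos + 1
    end = rec.find(AM, start)
    if end == -1:
        end = len(rec)
    return rec[start:end], skipped


def count_scans(rec):
    """Model of FOR I=1 TO N / LINE = REC<I> / NEXT I, counting delimiters skipped."""
    n = rec.count(AM) + 1  # what DCOUNT returns for a non-empty record
    lines = []
    total = 0
    for i in range(1, n + 1):
        line, skipped = extract_attribute(rec, i)
        lines.append(line)
        total += skipped
    return lines, total


if __name__ == "__main__":
    rec = AM.join("line %d" % k for k in range(1, 1001))
    lines, total = count_scans(rec)
    print(len(lines), total)  # prints: 1000 499500
```

Under that assumption, N lines cost roughly N*N/2 delimiter skips, so doubling the report size roughly quadruples the work. Any approach that keeps its place in the string, or splits the string once, avoids this.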



#2
frosty
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 01:43 PM

Write the dynamic array to a workfile, then MATREAD it back in
(as a dimensioned array.) There might be a more direct way to
do this in BASIC, like MATPARSE or similar? (Been a long time
since I've used A/P.)

--
frosty




#3
Marvin Fisher
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 02:02 PM

Give the following a shot

* the report record is attribute-mark delimited, so count and parse on @AM
VALCNT = DCOUNT(dyn.array, @AM)
DIM dim.array(VALCNT)
MATPARSE dim.array FROM dyn.array using @AM


#4
Dale Benedict
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 02:06 PM

Frosty's idea might be your easiest solution.

If the report record only contains attribute marks, or each attribute has
the exact same structure and number of values and sub-values within each
value, then you could write the record to disk and then perform a qselect of
the item from disk. Then just use a readnext statement to get each
delimited piece of data.

I find that not too many people realize the ease, power, and speed of doing
qselects.

Regards,

Dale




#5
Joe
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 03:07 PM

If you've already got the array in memory, can't you simply SELECT the
array, then do a LOOP WHILE READNEXT thing?

SELECT REC
LOOP
WHILE READNEXT LINE DO
whatever...
REPEAT

The syntax may be different for your platform, but you get the idea...

Regards,
Joe


#6
douglas@pickteam.com
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 03:35 PM

So AP didn't have the REMOVE verb. It's a great tool.


#7
Tony Gravagno
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 04:06 PM

Depending on the structure of your data you can cut it up into more
manageable pieces. Rough code example off the cuff, needs refinement:

****
read blob from file,key else stop
blocksize = 10000
bnum = 0
append = ""
loop
   * work with a small block here, not the entire blob
   chunk = blob[blocksize*bnum+1,blocksize]
until chunk = "" do
   * prepend the partial atb carried over from the previous block
   block = append : chunk
   gosub process.block
   bnum += 1
repeat
* whatever is still in append is the final atb of the blob
if append # "" then
   line = append
   gosub process.line
end
stop
process.block:
ct = dcount(block,@am)
for atb = 1 to ct
   line = block<atb>
   if atb = ct then
      * the last atb may continue into the next block, so hold it back
      append = line
      return
   end
   gosub process.line ; * not included here
next atb
return
****
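
For readers who want to check the carry-over logic outside of Pick, the same block-at-a-time idea can be sketched in Python. This is only a model of the technique, not a translation of any particular BASIC runtime. The function name `iter_lines_in_blocks` and the use of Python string slicing are assumptions made for the illustration.

```python
AM = chr(254)  # attribute mark


def iter_lines_in_blocks(blob, blocksize=10000):
    """Yield attribute-mark-delimited lines, touching one block at a time.

    The last piece of each block may be an incomplete line, so it is
    carried over and glued onto the front of the next block.
    """
    carry = ""
    for start in range(0, len(blob), blocksize):
        block = carry + blob[start:start + blocksize]
        parts = block.split(AM)
        carry = parts.pop()  # possibly incomplete: hold it back
        for line in parts:
            yield line
    if blob:
        yield carry  # the final line of the blob


if __name__ == "__main__":
    blob = AM.join(["first line", "", "third line", "last"])
    # A tiny block size forces lines to straddle block boundaries.
    for line in iter_lines_in_blocks(blob, blocksize=4):
        print(repr(line))
```

The point of the carry variable is that the result does not depend on the block size: empty lines in the middle of the report come through, and a line split across two blocks is reassembled before it is processed.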

If the blob is some number of megabytes large, then write it out to
the host file system and use socket functions to pull it in as smaller
blocks - reading the whole thing into BASIC just messes with memory
and overflow. Depending on the data it may not be a good idea to
store them in the MV file system anyway.

HTH
Tony


#8
Luke Webber
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 05:58 PM

Dale Benedict wrote:

Quote:
Frosty's idea might be your easiest solution.

If the report record only contains attribute marks, or each attribute has
the exact same structure and number of values and sub-values within each
value, then you could write the record to disk and then perform a qselect of
the item from disk. Then just use a readnext statement to get each
delimited piece of data.

I find that not too many people realize the ease, power, and speed of doing
qselects.

In fact, you don't even need to do the write and qselect. Just something
like this...

SELECT REC TO SELVBL
LOOP
READNEXT LINE FROM SELVBL ELSE EXIT
...
REPEAT

Even simpler than the original version. Almost elegant.
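
Why this would be fast can be sketched with a toy model in Python. The idea modelled here is that a select list keeps a cursor, so each READNEXT resumes where the previous one stopped instead of rescanning from the top. Whether AP or D3 implement it exactly this way internally is an assumption. `SelectList`, `readnext` and `read_all` are hypothetical names, and how a real READNEXT treats empty attributes is not modelled.

```python
AM = chr(254)  # attribute mark


class SelectList:
    """Toy stand-in for SELECT REC TO SELVBL: remembers where the last read ended."""

    def __init__(self, rec):
        self._rec = rec
        self._pos = 0 if rec else None  # None means the list is exhausted

    def readnext(self):
        """Return (True, line), or (False, None) when exhausted (the ELSE branch)."""
        if self._pos is None:
            return False, None
        end = self._rec.find(AM, self._pos)
        if end == -1:
            line = self._rec[self._pos:]
            self._pos = None
        else:
            line = self._rec[self._pos:end]
            self._pos = end + 1
        return True, line


def read_all(rec):
    """LOOP / READNEXT LINE FROM SELVBL ELSE EXIT / ... / REPEAT."""
    selvbl = SelectList(rec)
    lines = []
    while True:
        ok, line = selvbl.readnext()
        if not ok:
            break  # ELSE EXIT
        lines.append(line)
    return lines


if __name__ == "__main__":
    print(read_all(AM.join(["header", "detail 1", "detail 2"])))
```

Each character of the record is examined once in total, so the cost grows linearly with the size of the report.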

I use this all the time, as well as the QSELECT version, and it's
blazing fast. Sometimes I use QSELECT to process flat input files, rather
than %open/%read, because that saves me the need to use FLASH
compilation. FLASH compilation can be a pain if you need to call
subroutines that aren't FLASH compiled.

Luke


#9
Luke Webber
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 06:03 PM

Joe wrote:

Quote:
If you've already got the array in memory, can't you simply SELECT the
array, then do a LOOP WHILE READNEXT thing?

SELECT REC
LOOP
WHILE READNEXT LINE DO
whatever...
REPEAT

The syntax may be different for your platform, but you get the idea...

Bingo. Almost exactly the same as my suggestion, the only difference
being that AP and D3 don't have WHILE READNEXT. You need to use READNEXT
ELSE EXIT instead.

To the OP, this is undoubtedly the fastest and the most elegant
solution. The MATPARSE suggestions won't come close to it. I also don't
think AP had array redimensioning, and I believe the size of AP arrays
was relatively limited.

Luke


#10
Frank Winans
Re: Optimizing extraction of elements from large dyn arrays - 07-29-2005, 06:29 PM

"Joe" wrote:
Quote:
iJah wrote:
large dynamic arrays - in Advanced Pick.
This is being done all over with the usual

N=DCOUNT(REC,CHAR(254))
FOR I=1 TO N
LINE = REC<I>
NEXT I

obviously the deeper you get into the array the slower the crawl.
any clues?

If you've already got the array in memory, can't you simply SELECT the
array, then do a LOOP WHILE READNEXT thing?

SELECT REC
LOOP
WHILE READNEXT LINE DO
whatever...
REPEAT

The syntax may be different for your platform, but you get the idea...

Regards,
Joe

* Close, but AP READNEXT wants an ELSE clause.
SELECT REC
LOOP
   READNEXT LINE ELSE LINE = "sentryvalue"
WHILE LINE # "sentryvalue" DO
   whatever...
REPEAT
**** And just because it is so easy to code, I'd try Frosty's &
**** Marvin Fisher's idea of a dimensioned array, for a 'plan B'
worksize = DCOUNT(rec, @AM)
DIM work(worksize)
MATPARSE work FROM rec using @AM
FOR J=1 TO worksize
   line = work(J)
NEXT J
***** Dale Benedict advocated QSELECT, but I just cannot
***** see the charm of writing REC to a disk item just so QSELECT can
***** read it back in. Just SELECT REC avoids a read/write pair.
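
The 'plan B' trade-off can also be sketched in Python, again only as a model. Splitting the record once gives direct indexing afterwards, at the price of holding both the original string and the parsed array in memory. `matparse` here is a hypothetical Python helper named after the BASIC statement, not the statement itself, and the upper-casing is a placeholder for real per-line work.

```python
AM = chr(254)  # attribute mark


def matparse(rec, delimiter=AM):
    """Split the record once, like filling a dimensioned array."""
    return rec.split(delimiter) if rec else []


def process(rec):
    """Parse once, then index each element directly with no rescan."""
    work = matparse(rec)
    worksize = len(work)
    out = []
    for j in range(1, worksize + 1):  # 1-based, like the BASIC loop
        line = work[j - 1]
        out.append(line.upper())  # placeholder for the real handling
    return out


if __name__ == "__main__":
    print(process(AM.join(["alpha", "beta", "gamma"])))
```

If array size limits are a concern on a given platform, the cursor-style SELECT approach has the advantage of never needing the second copy.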







