dbTalk Databases Forums  

@AM and other bad characters in key in Universe

comp.databases.pick comp.databases.pick


Discuss @AM and other bad characters in key in Universe in the comp.databases.pick forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Steven L. Stein
 
Posts: n/a

Default @AM and other bad characters in key in Universe - 06-25-2003 , 01:31 PM






[This followup was posted to comp.databases.pick and a copy was sent to the cited author.]

We have a huge database (over 2,000,000 records) which has some keys
which contain bad characters (believed to be @AM [CHAR(254)]. I'm
attempting to delete these, but I've been unsuccessful in obtaining any
of them that are complete. For instance, if the key is "0-123~254~-56"
(where ~254~ is an attribute mark (CHAR(254)), I get "0-123" when I
readnext from an active select list. This is true regardless of whether
the list is from SELECT or from a saved list (SELECT followed by SAVE-
LIST followed by GET-LIST).

I tried two things a colleague gave me which were found on the Informix-
U2 listserve, but neither one works:

1. Create I-DESCRIPTOR ANYAM:
0001: I
0002: TRANS(filename,@ID,-1,'X'); IF INDEX(@1,@VM,1) THEN 'YES'
ELSE 'NO'

Note: testing form @VM since TRANS lowers the data

Create I-DESCRIPTOR AMID
0001: I
0002: LOWER(@ID)

Note: lower the @ID one, so when saved in a select list, the @AM
doesn't screw up the list

SELECT filename WITH ANYAM = "YES" SAVING UNIQUE AMID
SAVE-LIST BADONES

In a BASIC program, do READNEXT on select list, RAISE the ID (RAISE
(key), and delete it from the file.

2. Identify the bad keys by:

SELECT filename WITH @ID NE EVAL "CONVERT(@AM,']',@ID)" SAVING EVAL
"LOWER(LOWER(LOWER(@ID)))"
SAVE-LIST listname

In a BASIC program, do READNEXT on select list, RAISE the ID (RAISE
(RAISE(RAISE(key)))), and delete it from the file.

A few years ago, when we were on Advanced Pick (we're now on Universe
9.4 on SCO OpenServer 5), the only way to get rid of such records was to
create a new file, SELECT the old file, and COPY the old file to the new
one. The keys with the bad data wouldn't be found by the COPY, and so
the new file would contain all the good records and none of the bad.
You would then delete the old file and rename the new file (so it was
the same name as the old one just deleted).

Thanks

Steve

Reply With Quote
  #2  
Old   
rg
 
Posts: n/a

Default Re: @AM and other bad characters in key in Universe - 06-25-2003 , 05:57 PM






That might work with D3/Pick. But I think that Universe will perform a
standard select and therefore build a corrupted list, just as if it was a
separate procedure.

Hope I'm wrong, but I do know that a Universe Basic SELECT is much slower
then the old Pick Basic select. Because it's busy creating the list.

Roger
"JPORSHA924" <jporsha924 (AT) aol (DOT) com> wrote

Quote:
since a select list (whether saved or not) looks just like any other item,
if
there are am or vm in it, it will screw up what you think are the keys.

however - if you do a select within a basic program, this actually
sequentially
reads the keys and returns them one at a time - so bad/funny characters
should
go thru.

jay



Reply With Quote
  #3  
Old   
Martin Kent
 
Posts: n/a

Default Re: @AM and other bad characters in key in Universe - 06-26-2003 , 11:09 AM




The following code will identify keys containing @AM's. If each item of
the select list is not a key, then it must be part of a key
Limitations
(1) a key cannot be part of another key i.e the program could not locate
12345 @AM 67890 if keys 12345 or 67890 existed
(2) only handles duplicates containing @AMs
(3) locking and duplicate checking is needed
(4) assumes items are not being added / removed when this code runs
(5) anything else I have overlooked. Code is for illustrative purposes
only. No responsibility accepted for its use.

OPEN "DATAFILE" TO DATAFILE ELSE STOP
EXECUTE 'SELECT DATAFILE'
INNER = 0
AMS.REMOVED = 0
XKEY = ""
LOOP
READNEXT KEY ELSE INNER = 1
UNTIL INNER DO
IF XKEY THEN XKEY := @AM : KEY ELSE XKEY = KEY
READ RECORD FROM DATAFILE,XKEY THEN
* only write if it contains am's
IF COUNT(XKEY,@AM) > 0 THEN
NEW.XKEY = CHANGE(XKEY,@AM,"")
READ DUPREC FROM DATAFILE, NEW.XKEY THEN
* it's a duplicate - handle this however you have to
END ELSE
* try without writes for analysis purposes
* WRITE RECORD ON DATAFILE, CHANGE(XKEY,@AM,"")
* DELETE DATAFILE, XKEY
AMS.REMOVED += 1
END
END
XKEY = "" ; * reset XKEY
END ELSE
* it's part of a key - carry on
END
REPEAT
CRT 'AMS REMOVED ' : AMS.REMOVED

--
Posted via http://dbforums.com

Reply With Quote
  #4  
Old   
Bob Woodward
 
Posts: n/a

Default Re: @AM and other bad characters in key in Universe - 06-28-2003 , 10:47 PM



"rg" <rglenfld (AT) optonlinespam (DOT) net> wrote

Quote:
That might work with D3/Pick. But I think that Universe will perform a
standard select and therefore build a corrupted list, just as if it was a
separate procedure.

Hope I'm wrong, but I do know that a Universe Basic SELECT is much slower
then the old Pick Basic select. Because it's busy creating the list.

Roger
"JPORSHA924" <jporsha924 (AT) aol (DOT) com> wrote in message
news:20030625160144.13592.00001789 (AT) mb-m25 (DOT) aol.com...
since a select list (whether saved or not) looks just like any other
item,
if
there are am or vm in it, it will screw up what you think are the keys.

however - if you do a select within a basic program, this actually
sequentially
reads the keys and returns them one at a time - so bad/funny characters
should
go thru.

jay

If you use EXECUTE "SELECT MYDATA" then you are correct. If, on the
otherhand, you open the file to FH.MYDATA (FH=filehandle) then use the
command SELECT FH.MYDATA, this will perform the sequential read through the
whole file. DO NOT TRY TO MAKE A SELECT LIST. The list handler uses system
delimiters to divide key values, which is what you're trying to find. The
BASIC command SELECT (vs the TCL SELECT command) uses the blocks of data to
supply the key values. They will NOT be in any sorted order, which you
don't really care about. You just want to find the keys and pull those data
records out of the database. Here's a short snippet that you should be able
to modify to your needs. Add some reporting, if you like, and maybe a
couple of counters to display the progress as it will take awhile to go
through your file. Again, this is not complete but should get you on the
right track of how to find and get rid of the unwanted data.

CHKVALS=CHAR(255):CHAR(254):CHAR(253):CHAR(252)
OPEN "MYDATA" TO FH.MYDATA
SELECT FH.MYDATA
EOF=""
LOOP
....READNEXT IDKEY ELSE EOF = 1
UNTIL EOF
....CLEAN.KEY = IDKEY
....CONVERT CHKVALS TO "" IN CLEAN.KEY
....IF CLEAN.KEY = IDKEY THEN
.......GOODKEY.CNT += 1
....END ELSE
.......BADKEY.CNT += 1
.......DELETE FH.MYDATA, IDKEY
....END
REPEAT

You'll have to check the CONVERT statement but what you're trying to do with
that statement is to take the contents of the keyvalue just obtained, strip
out anything that is not suppose to be there by replacing them with null's
then comparing the results to the original key. If they are different,
you've found a bad key. Delete the record, or read it and write it into an
"audit" file, being sure to modify the key to REPRESENT the delimiters.
Then delete the record using, again, the original key value obtained in the
READNEXT statement.

The usual cause of this is inproper key values being created from
multivalued data. Programs that think a field should be single valued being
used as a key for another file, or xlate type data being used as a source
for key fields from a multi-valued list.

Good luck.
Bob Woodward




Reply With Quote
  #5  
Old   
Ross Ferris
 
Posts: n/a

Default Re: @AM and other bad characters in key in Universe - 06-29-2003 , 05:10 AM



Steven,

What's wrong with the COPY technique you used in D3 ?

I would thing that a variation of the "Drumheller Trick" should fit
the bill though. You simply have an I-Type dictionary item that, in
the first instance, simply checks @ID for "crap characters" using real
live basic code, and converts "bad" characters to a printable
equivalent, and prints this (and other key information) out.

A key that doesn't contain bad characters returns a "null".

Phase 1, assuming your new dictionary item is called BADKEY, is to
simply

SELECT MYFILE WITH BADKEY # ""

This will do 2 things .... the SELECT may (or may not) return a list
of keys that are bad .... the more important thing is that your I-Type
basic code will actually print out the information you decided to
capture/display

Phase 2 is then to modify your I-type so that rather than printing @ID
and @RECORD information, if a key is "bad", you open up the file, and
delete the record using @ID.

I can not verify that this will work ..... but it sure sounds good to
me :-)


Regards,



Ross Ferris
Stamina Software
Visage - an Evolution in Software Development






Steven L. Stein <sls8 (AT) cornell (DOT) edu> wrote

Quote:
[This followup was posted to comp.databases.pick and a copy was sent to the cited author.]

We have a huge database (over 2,000,000 records) which has some keys
which contain bad characters (believed to be @AM [CHAR(254)]. I'm
attempting to delete these, but I've been unsuccessful in obtaining any
of them that are complete. For instance, if the key is "0-123~254~-56"
(where ~254~ is an attribute mark (CHAR(254)), I get "0-123" when I
readnext from an active select list. This is true regardless of whether
the list is from SELECT or from a saved list (SELECT followed by SAVE-
LIST followed by GET-LIST).

I tried two things a colleague gave me which were found on the Informix-
U2 listserve, but neither one works:

1. Create I-DESCRIPTOR ANYAM:
0001: I
0002: TRANS(filename,@ID,-1,'X'); IF INDEX(@1,@VM,1) THEN 'YES'
ELSE 'NO'

Note: testing form @VM since TRANS lowers the data

Create I-DESCRIPTOR AMID
0001: I
0002: LOWER(@ID)

Note: lower the @ID one, so when saved in a select list, the @AM
doesn't screw up the list

SELECT filename WITH ANYAM = "YES" SAVING UNIQUE AMID
SAVE-LIST BADONES

In a BASIC program, do READNEXT on select list, RAISE the ID (RAISE
(key), and delete it from the file.

2. Identify the bad keys by:

SELECT filename WITH @ID NE EVAL "CONVERT(@AM,']',@ID)" SAVING EVAL
"LOWER(LOWER(LOWER(@ID)))"
SAVE-LIST listname

In a BASIC program, do READNEXT on select list, RAISE the ID (RAISE
(RAISE(RAISE(key)))), and delete it from the file.

A few years ago, when we were on Advanced Pick (we're now on Universe
9.4 on SCO OpenServer 5), the only way to get rid of such records was to
create a new file, SELECT the old file, and COPY the old file to the new
one. The keys with the bad data wouldn't be found by the COPY, and so
the new file would contain all the good records and none of the bad.
You would then delete the old file and rename the new file (so it was
the same name as the old one just deleted).

Thanks

Steve

Reply With Quote
  #6  
Old   
Martin Kent
 
Posts: n/a

Default Re: @AM and other bad characters in key in Universe - 06-30-2003 , 03:10 AM




A select...readnext in Universe cannot retrieve a key containing an @AM
- it will always return such a key as two separate keys, so dictionary
items to convert @AM to something else will not work. This is true of a
BASIC SELECT / EXECUTE 'SELECT' or a TCL SELECT.

--
Posted via http://dbforums.com

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.