#2
When I imported data into FM from another database program, some of the accented characters were changed into other ASCII or Unicode symbols. Is there any way I can do a find in a field to locate all records with any characters or symbols above the ASCII or Unicode range of standard English letters and punctuation marks without having to do separate finds for each individual symbol?

--
Cecil N. Bankston
Baton Rouge, LA USA
#3
Did you select the appropriate option in the "Character set:" drop-down menu during import? Did you try any of the other options to see if they helped? Are there other export options from the source database? Re-importing the data is likely the easiest way to correct this, if it is possible.

Edit-->Find/Replace will require one operation per substitution. Do you have a complete list of the incorrect/correct symbols?

If so, you could write a calculation:

Substitute(yourField; ["a"; "A"]; ["b"; "B"]; ["c"; "C"])

where the lowercase letter represents the incorrect symbol and the uppercase letter represents the correct one. (Each search/replace pair goes in its own pair of square brackets.) Your version would have a lot more than three pairs, however. You would have to do this for each and every field that could contain the incorrect characters.

(You can implement this as a calculated field mirroring the imported field, apply it as an auto-enter rule during import, or run it as a Replace with calculated result using the Replace Field Contents command.)

Unfortunately, if the data is incorrect in the source file I don't know of another way to correct this.
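
For anyone who wants to try the pairwise substitution logic outside FileMaker first (for example on an exported text file), here is a minimal Python sketch. It is not FileMaker code; the function name `apply_fixes` and the pairs in `FIXES` are placeholders that mirror the a/A, b/B, c/C example above, not a real list of bad and good symbols.

```python
# Placeholder pairs (wrong symbol, correct symbol), mirroring the example above.
# The real list depends on what the import actually garbled.
FIXES = [("a", "A"), ("b", "B"), ("c", "C")]


def apply_fixes(text, pairs):
    """Apply each (wrong, right) replacement in order, like a chain of substitutions."""
    for wrong, right in pairs:
        text = text.replace(wrong, right)
    return text


if __name__ == "__main__":
    print(apply_fixes("abcd", FIXES))
```

Because the pairs are applied one after another, a later pair also sees the output of an earlier one, so the order matters if a "correct" character of one pair is the "wrong" character of another.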
|
Bill
#4
Thanks for the reply.

Bill Marriott wrote:

> Did you select the appropriate option in the "Character set:" drop-down menu during import? Did you try any of the other options to see if they helped? Are there other export options from the source database? Re-importing the data is likely the easiest way to correct this, if it is possible.

The data was imported years ago from a Superbase file on an Amiga computer, so re-importing is not an option.

> Edit-->Find/Replace will require one operation per substitution. Do you have a complete list of the incorrect/correct symbols?

Unfortunately, no.

> If so, you could write a calculation: Substitute(yourField; ["a"; "A"]; ["b"; "B"]; ["c"; "C"]) where the lowercase letter represents the incorrect symbol and the uppercase letter represents the correct one. Your version would have a lot more than three pairs, however. You would have to do this for each and every field that could contain the incorrect characters.

The database is large, so I recognize the incorrect characters only when I happen to view a particular record that contains them. At that point I can generally tell or find out what the correct character is supposed to be and correct it manually. The problem is finding the records that need correcting without having to browse through thousands of records. That is why I hoped there was a means of finding any characters or symbols above the ASCII or Unicode range of standard English letters and punctuation marks. That found set would be much smaller than the entire file and would include all the erroneous characters.

> (You implement this as a calculated field mirroring the imported field; or you apply this as an auto-enter rule during import; or as a Replace with calculated result using the Replace Field contents command.) Unfortunately, if the data is incorrect in the source file I don't know of another way to correct this.

The source data was correct. I expect the translation from AmigaDOS to Windows was the problem.
#5
It should be possible to attack the problem by brute force: define the allowed range of characters, then strip it out and see what is left. For example, define a calculation:

Substitute (yourField; ["a"; ""]; ["b"; ""]; ["c"; ""]; ["d"; ""]; ... ["z"; ""]; ["A"; ""]; ["B"; ""]; ... ["Z"; ""]; ["0"; ""]; ... ["9"; ""]; ...)

then continue with comma, period, question mark, ampersand, percent, colon, semicolon, dollar sign, asterisk, parentheses (l+r), brackets (l+r), braces (l+r), angle brackets (l+r), plus, minus, equals, exclamation, at, caret, tilde, space, tab, carriage return, slash, backslash, pound, underscore, quote, apostrophe, left-apostrophe, pipe...

All told I'd wager there are only around 75 characters [alphabet (26x2), digits (10), symbols (~20)]. (Although you could probably get away with omitting most of those symbols, and just adding the ones you need to the calc expression if you need them.)

This calc ideally should be blank for all records once you've cleaned everything up. So do a find on that field for "*" (any non-empty value; "=" on its own matches empty fields) and see what turns up. Keep fixing characters until nothing turns up in that find anymore. Then you are done.
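
The same "strip the allowed characters and look at what is left" idea can be sketched in Python, for instance to test it on an exported copy of the field. This is only an illustration, not FileMaker code: the names `ALLOWED`, `leftover` and `suspect_records` are made up here, and the allowed set (ASCII letters, digits, punctuation and common whitespace) is an assumption you would adjust to your own data.

```python
import string

# Assumption: "allowed" means ASCII letters, digits, punctuation and common whitespace.
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " \t\r\n")


def leftover(text):
    """Return the characters of text that are not in ALLOWED, in original order."""
    return "".join(ch for ch in text if ch not in ALLOWED)


def suspect_records(records):
    """Yield (index, leftover characters) for each record with a non-empty leftover."""
    for index, value in enumerate(records):
        rest = leftover(value)
        if rest:
            yield index, rest


if __name__ == "__main__":
    sample = ["plain text", "na\u00efve", "ok!"]
    for index, rest in suspect_records(sample):
        print(index, rest)
```

A record whose leftover is empty contains only allowed characters, which corresponds to the calc field being blank.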
#6
The brute force calculation method described in the previous post probably would work. I'll give it a try.

I will rephrase the question to a more generalized one: can find requests locate records containing a list or range (as opposed to individual) of literal special characters (accented letters or symbols) occurring anywhere in a field, and if so, how? I expect it probably can't be done.

For example, *"é"* will find é (accented e, in case the server doesn't show the accented character I inserted between the quotes) anywhere in a field. If I enter *"é"*...*"ö"* (umlaut o) to try to find all the accented or special characters between é and ö, nothing is found. If I enter é...ö, all records with words beginning with any non-accented character between e and o are found.

I know I could use the brute force method of separate find requests for each letter, if FM can handle that many requests in one find. Of course that would require knowing all the possible characters to include, which I don't.


#7
In article <7Aojf.195$oz5.167@dukeread03>, cbankston (AT) spamfreecox (DOT) net says...

> The brute force calculation method specified below probably would work. I'll give it a try. I will rephrase the question to a more generalized one of how or if one can use find requests to locate records containing a list or range (as opposed to individual) of literal special characters (accented letters or symbols) occurring anywhere in a field. I expect it probably can't be done.

List yes, range no.

> For example: *"é"* will find é (accented e in case the server doesn't show the accented character I inserted between the quotes)

Actually just the 'e' by itself will work, the *"e"* construct is unnecessary.
#8
In article <MPG.1df7c1687fa08f00989def (AT) shawnews (DOT) vf.shawcable.net>, nospam (AT) nospam (DOT) com says...

> In article <7Aojf.195$oz5.167@dukeread03>, cbankston (AT) spamfreecox (DOT) net says...
>
>> The brute force calculation method specified below probably would work. I'll give it a try. I will rephrase the question to a more generalized one of how or if one can use find requests to locate records containing a list or range (as opposed to individual) of literal special characters (accented letters or symbols) occurring anywhere in a field. I expect it probably can't be done.
>
> List yes, range no.
>
>> For example: *"é"* will find é (accented e in case the server doesn't show the accented character I inserted between the quotes)
>
> Actually just the 'e' by itself will work, the *"e"* construct is unnecessary.

Oops... no. You need the *'s but not the quotes. *e* works.

"ö" (that's o with an umlaut) finds all records with words beginning with characters >o (that's o with no accent), disregarding the order of |
#9
Regular expressions would make the brute force method much simpler, because you could easily make the match case-insensitive, give ranges for the alphabet [a-z] and digits [0-9], and then would only have to specify the acceptable symbols by hand. But regexes are only available via plugin.
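
To show what such a pattern might look like, here is a small sketch using Python's standard `re` module rather than any FileMaker plugin (no particular plugin or its syntax is implied). The pattern `NOT_ALLOWED` and the function `unexpected_chars` are illustrative names, and the handful of punctuation marks in the character class is just an assumed "acceptable" set to extend as needed.

```python
import re

# A negated character class: anything that is NOT a letter, digit, whitespace,
# or one of a few assumed-acceptable punctuation marks.
# re.ASCII keeps the case-insensitive matching and \s limited to ASCII.
NOT_ALLOWED = re.compile(r"[^a-z0-9\s.,;:!?'\"()-]", re.IGNORECASE | re.ASCII)


def unexpected_chars(text):
    """Return every character in text that falls outside the allowed class."""
    return NOT_ALLOWED.findall(text)


if __name__ == "__main__":
    print(unexpected_chars("R\u00e9sum\u00e9, na\u00efve?"))
```

An empty result means the text contains only allowed characters; anything returned is a candidate for manual correction.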
#10
> Regular expressions would make the brute force method much simpler because you could easily specify it to be case insensitive and set the range for alphabet [a-z], and digits [0-9], and then would only have to manually specify the acceptable symbols. But regex are only available via plugin.

Is there a particular plugin for FM7 you had in mind?

--
Cecil N. Bankston
Baton Rouge, LA USA