dbTalk Databases Forums  

similarity of names

comp.databases.oracle.misc comp.databases.oracle.misc


Discuss similarity of names in the comp.databases.oracle.misc forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Norbert Winkler
 
Posts: n/a

Default similarity of names - 03-09-2009 , 08:09 AM






Hi,

we are using this construct
replace(replace(replace(replace(replace(replace(re place(replace(replace(replace(
replace(replace(replace(
lower(ANY_NAME),
' ',''),',',''),'ä','a'),'ü','u'),'ö','o'),'ß','s'), 'oe','o'
),'ue','u' ),'ae','a'),
'y','i'),'ll','l'),'nn','n'),'ss','s')

for comparing a given person-name with our database-person-names

is there a better solution or are the multiple replaces fast enough.
In my programms I iterate char by char and substitute using fast
case-constructs for substitution.
But I don't know how fast a char-by-char-solution in a PL/SQL-function
would be.

--
Norbert
Oracle9i Enterprise Edition Release 9.2.0.8.0 64Bit

Reply With Quote
  #2  
Old   
CarlosAL
 
Posts: n/a

Default Re: similarity of names - 03-09-2009 , 08:31 AM






On 9 mar, 15:09, Norbert Winkler <norbert.winkl... (AT) gmx (DOT) de> wrote:
Quote:
Hi,

we are using this construct
replace(replace(replace(replace(replace(replace(re place(replace(replace(replace(
* * * * replace(replace(replace(
* * * * * lower(ANY_NAME),
* * * * ' ',''),',',''),'ä','a'),'ü','u'),'ö','o'),'ß','s'), 'oe','o'
),'ue','u' ),'ae','a'),
* * * * 'y','i'),'ll','l'),'nn','n'),'ss','s')

for comparing a given person-name with our database-person-names

is there a better solution or are the multiple replaces fast enough.
In my programms I iterate char by char and substitute using fast
case-constructs for substitution.
But I don't know how fast a char-by-char-solution in a PL/SQL-function
would be.

--
Norbert
Oracle9i Enterprise Edition Release 9.2.0.8.0 64Bit
TRANSLATE() ?

HTH

Cheers.

Carlos.


Reply With Quote
  #3  
Old   
fransh@hotmail.com
 
Posts: n/a

Default Re: similarity of names - 03-09-2009 , 08:52 AM



On 9 mrt, 15:09, Norbert Winkler <norbert.winkl... (AT) gmx (DOT) de> wrote:
Quote:
Hi,

we are using this construct
replace(replace(replace(replace(replace(replace(re place(replace(replace(rep*lace(
* * * * replace(replace(replace(
* * * * * lower(ANY_NAME),
* * * * ' ',''),',',''),'ä','a'),'ü','u'),'ö','o'),'ß','s'), 'oe','o'
),'ue','u' ),'ae','a'),
* * * * 'y','i'),'ll','l'),'nn','n'),'ss','s')

for comparing a given person-name with our database-person-names

is there a better solution or are the multiple replaces fast enough.
In my programms I iterate char by char and substitute using fast
case-constructs for substitution.
But I don't know how fast a char-by-char-solution in a PL/SQL-function
would be.

--
Norbert
Oracle9i Enterprise Edition Release 9.2.0.8.0 64Bit
UTL_MATCH?

hth,
Frans H.,


Reply With Quote
  #4  
Old   
Larry W. Virden
 
Posts: n/a

Default Re: similarity of names - 03-10-2009 , 10:57 AM



On Mar 9, 10:52*am, fra... (AT) hotmail (DOT) com wrote:

Have you checked into the pl/sql soundex function? According to the
pages that came up in google, this might be worth trying if the names
being compared are likely to be phonetically similar.


Reply With Quote
  #5  
Old   
Norbert Winkler
 
Posts: n/a

Default Re: similarity of names - 03-11-2009 , 07:59 AM



Am Mon, 9 Mar 2009 07:52:48 -0700 (PDT) schrieb fransh (AT) hotmail (DOT) com:

Quote:
UTL_MATCH
Thanks for advising to this, the "Levenshtein Distance" is IMHO the best
solution for similarity, I did'nt know that it exists in Oracle.

And in
http://www.oracle.com/technology/boo...l%20ch%201.pdf
I see it's the well known algorithm they iterate in PL/SQL.
So I could give it a try to compare the speed of replace(replace(..
by a PL/SQL-cylce.

btw.
TRANSLATE() isn't a solution because I want to change German Umlaut Ä -->
AE (1 char into 2 chars)


--
Norbert
Oracle9i Enterprise Edition Release 9.2.0.8.0 64Bit


Reply With Quote
  #6  
Old   
Michael Austin
 
Posts: n/a

Default Re: similarity of names - 03-12-2009 , 07:28 PM



Norbert Winkler wrote:
Quote:
Am Mon, 9 Mar 2009 07:52:48 -0700 (PDT) schrieb fransh (AT) hotmail (DOT) com:

UTL_MATCH

Thanks for advising to this, the "Levenshtein Distance" is IMHO the best
solution for similarity, I did'nt know that it exists in Oracle.

And in
http://www.oracle.com/technology/boo...l%20ch%201.pdf
I see it's the well known algorithm they iterate in PL/SQL.
So I could give it a try to compare the speed of replace(replace(..
by a PL/SQL-cylce.

btw.
TRANSLATE() isn't a solution because I want to change German Umlaut Ä --
AE (1 char into 2 chars)



Norbert, I see in your sig:
"Oracle9i Enterprise Edition Release 9.2.0.8.0 64Bit"

isn't it about time to upgrade. Now that Oracle is no longer back
porting security patches to this version, it should be considered for
your critical systems.


Reply With Quote
  #7  
Old   
Norbert Winkler
 
Posts: n/a

Default Re: similarity of names - 03-17-2009 , 09:05 AM



Am Thu, 12 Mar 2009 20:28:29 -0500 schrieb Michael Austin:

Quote:
Norbert Winkler wrote:

Norbert, I see in your sig:
"Oracle9i Enterprise Edition Release 9.2.0.8.0 64Bit"

isn't it about time to upgrade. Now that Oracle is no longer back
porting security patches to this version, it should be considered for
your critical systems.
It's right - I'd prefer a version with implemented regular expression or
the Levensthein-algorith. But I'm only a user.
And the others like "never touch a running system" ....



--
Norbert
Oracle9i Enterprise Edition Release 9.2.0.8.0 64Bit


Reply With Quote
  #8  
Old   
Paulie
 
Posts: n/a

Default Re: similarity of names - 03-18-2009 , 06:36 AM





On Mar 17, 3:05 pm, Norbert Winkler <norbert.winkl... (AT) gmx (DOT) de> wrote:


Quote:
It's right - I'd prefer a version with implemented regular expression or
the Levensthein-algorith. But I'm only a user.
And the others like "never touch a running system" ....

"Why bother abolishing slavery - it's worked for thousands of years?"

;-)



Well, Norbert, do you have the source code to the application
that you're running?

If you do, why not put in a shadow column "search_name" ,
capitalise everything in it, and when users put in
an umlaut, convert it to AE before inserting?

Then, when searching, if the user inputs an umlaut, change
the search string accordingly and search against search_name.

I did this once with Irish names -

NAME SEARCH_NAME
McGuire MACGUIRE
MCGuire MACGUIRE
MacGuire MACGUIRE
MACGuire MACGUIRE
MaCGuire MACGUIRE
MaGuire MACGUIRE


Helps eliminate different/mistaken spellings.

Help?


Paul...






Quote:
Norbert




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.