dbTalk Databases Forums  

Fuzzy string search?

comp.databases.oracle.misc comp.databases.oracle.misc


Discuss Fuzzy string search? in the comp.databases.oracle.misc forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Marten Lehmann
 
Posts: n/a

Default Fuzzy string search? - 01-08-2008 , 08:13 AM






Hello,

in our application it would be very helpful, if the search wouldn't
return only exact matches, but also similar items, e.g.

search for "models" returns also "model"
search for "exmaple" returns also "example"
search for "dl365" returns also "dl385"

How can I do this with Oracle? Is there any extension I have to install?
Or is there a function like the function to match regular expressions?

Regards
Marten

Reply With Quote
  #2  
Old   
Vladimir M. Zakharychev
 
Posts: n/a

Default Re: Fuzzy string search? - 01-08-2008 , 10:33 AM






On Jan 8, 5:13 pm, Marten Lehmann <lehmannmap... (AT) cnm (DOT) de> wrote:
Quote:
Hello,

in our application it would be very helpful, if the search wouldn't
return only exact matches, but also similar items, e.g.

search for "models" returns also "model"
search for "exmaple" returns also "example"
search for "dl365" returns also "dl385"

How can I do this with Oracle? Is there any extension I have to install?
Or is there a function like the function to match regular expressions?

Regards
Marten
1. Basic soundex function (works only with English):

SQL> select soundex('example'), soundex('exmaple') from dual;

SOUN SOUN
---- ----
E251 E251

However, soundex will, for example, distinguish 'model' and 'models',
so it's not universal.

2. Oracle Text indexes (supports several European languages plus
Japanese and OCR):

SQL> create table texts (id number(10) primary key, content
varchar2(4000 byte));

Table created.

SQL> create index ix$ctx#texts on texts(content) indextype is
ctxsys.context;

Index created.

SQL> insert into texts values(1, 'model');

1 row created.

SQL> insert into texts values(2, 'example');

1 row created.

SQL> insert into texts values(3, 'dl385');

1 row created.

SQL> commit;

Commit complete.

SQL> exec ctxsys.ctx_ddl.sync_index('IX$CTX#TEXTS');

PL/SQL procedure successfully completed.

SQL> select id from texts where contains(content,'fuzzy(models)') > 0;


ID
----------

1

SQL> select id from texts where contains(content,'fuzzy(exmaple)') >
0;


ID
----------

2

SQL> select id from texts where contains(content,'?dl365') > 0;


ID
----------

3


'?keyword' is a shortcut to 'fuzzy(keyword)' (fuzzy() can also accept
4 arguments for weighted matching.) See Oracle Text Reference for your
Oracle release for more information on this powerful technology.
Usually it's installed by default. In 10g and later, there's pretty
little effort needed to maintain Text indexes with "sync on commit"
option and scheduled online index rebuilds for volatile data.

Hth,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com



Reply With Quote
  #3  
Old   
Vladimir M. Zakharychev
 
Posts: n/a

Default Re: Fuzzy string search? - 01-08-2008 , 10:33 AM



On Jan 8, 5:13 pm, Marten Lehmann <lehmannmap... (AT) cnm (DOT) de> wrote:
Quote:
Hello,

in our application it would be very helpful, if the search wouldn't
return only exact matches, but also similar items, e.g.

search for "models" returns also "model"
search for "exmaple" returns also "example"
search for "dl365" returns also "dl385"

How can I do this with Oracle? Is there any extension I have to install?
Or is there a function like the function to match regular expressions?

Regards
Marten
1. Basic soundex function (works only with English):

SQL> select soundex('example'), soundex('exmaple') from dual;

SOUN SOUN
---- ----
E251 E251

However, soundex will, for example, distinguish 'model' and 'models',
so it's not universal.

2. Oracle Text indexes (supports several European languages plus
Japanese and OCR):

SQL> create table texts (id number(10) primary key, content
varchar2(4000 byte));

Table created.

SQL> create index ix$ctx#texts on texts(content) indextype is
ctxsys.context;

Index created.

SQL> insert into texts values(1, 'model');

1 row created.

SQL> insert into texts values(2, 'example');

1 row created.

SQL> insert into texts values(3, 'dl385');

1 row created.

SQL> commit;

Commit complete.

SQL> exec ctxsys.ctx_ddl.sync_index('IX$CTX#TEXTS');

PL/SQL procedure successfully completed.

SQL> select id from texts where contains(content,'fuzzy(models)') > 0;


ID
----------

1

SQL> select id from texts where contains(content,'fuzzy(exmaple)') >
0;


ID
----------

2

SQL> select id from texts where contains(content,'?dl365') > 0;


ID
----------

3


'?keyword' is a shortcut to 'fuzzy(keyword)' (fuzzy() can also accept
4 arguments for weighted matching.) See Oracle Text Reference for your
Oracle release for more information on this powerful technology.
Usually it's installed by default. In 10g and later, there's pretty
little effort needed to maintain Text indexes with "sync on commit"
option and scheduled online index rebuilds for volatile data.

Hth,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com



Reply With Quote
  #4  
Old   
Vladimir M. Zakharychev
 
Posts: n/a

Default Re: Fuzzy string search? - 01-08-2008 , 10:33 AM



On Jan 8, 5:13 pm, Marten Lehmann <lehmannmap... (AT) cnm (DOT) de> wrote:
Quote:
Hello,

in our application it would be very helpful, if the search wouldn't
return only exact matches, but also similar items, e.g.

search for "models" returns also "model"
search for "exmaple" returns also "example"
search for "dl365" returns also "dl385"

How can I do this with Oracle? Is there any extension I have to install?
Or is there a function like the function to match regular expressions?

Regards
Marten
1. Basic soundex function (works only with English):

SQL> select soundex('example'), soundex('exmaple') from dual;

SOUN SOUN
---- ----
E251 E251

However, soundex will, for example, distinguish 'model' and 'models',
so it's not universal.

2. Oracle Text indexes (supports several European languages plus
Japanese and OCR):

SQL> create table texts (id number(10) primary key, content
varchar2(4000 byte));

Table created.

SQL> create index ix$ctx#texts on texts(content) indextype is
ctxsys.context;

Index created.

SQL> insert into texts values(1, 'model');

1 row created.

SQL> insert into texts values(2, 'example');

1 row created.

SQL> insert into texts values(3, 'dl385');

1 row created.

SQL> commit;

Commit complete.

SQL> exec ctxsys.ctx_ddl.sync_index('IX$CTX#TEXTS');

PL/SQL procedure successfully completed.

SQL> select id from texts where contains(content,'fuzzy(models)') > 0;


ID
----------

1

SQL> select id from texts where contains(content,'fuzzy(exmaple)') >
0;


ID
----------

2

SQL> select id from texts where contains(content,'?dl365') > 0;


ID
----------

3


'?keyword' is a shortcut to 'fuzzy(keyword)' (fuzzy() can also accept
4 arguments for weighted matching.) See Oracle Text Reference for your
Oracle release for more information on this powerful technology.
Usually it's installed by default. In 10g and later, there's pretty
little effort needed to maintain Text indexes with "sync on commit"
option and scheduled online index rebuilds for volatile data.

Hth,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com



Reply With Quote
  #5  
Old   
Vladimir M. Zakharychev
 
Posts: n/a

Default Re: Fuzzy string search? - 01-08-2008 , 10:33 AM



On Jan 8, 5:13 pm, Marten Lehmann <lehmannmap... (AT) cnm (DOT) de> wrote:
Quote:
Hello,

in our application it would be very helpful, if the search wouldn't
return only exact matches, but also similar items, e.g.

search for "models" returns also "model"
search for "exmaple" returns also "example"
search for "dl365" returns also "dl385"

How can I do this with Oracle? Is there any extension I have to install?
Or is there a function like the function to match regular expressions?

Regards
Marten
1. Basic soundex function (works only with English):

SQL> select soundex('example'), soundex('exmaple') from dual;

SOUN SOUN
---- ----
E251 E251

However, soundex will, for example, distinguish 'model' and 'models',
so it's not universal.

2. Oracle Text indexes (supports several European languages plus
Japanese and OCR):

SQL> create table texts (id number(10) primary key, content
varchar2(4000 byte));

Table created.

SQL> create index ix$ctx#texts on texts(content) indextype is
ctxsys.context;

Index created.

SQL> insert into texts values(1, 'model');

1 row created.

SQL> insert into texts values(2, 'example');

1 row created.

SQL> insert into texts values(3, 'dl385');

1 row created.

SQL> commit;

Commit complete.

SQL> exec ctxsys.ctx_ddl.sync_index('IX$CTX#TEXTS');

PL/SQL procedure successfully completed.

SQL> select id from texts where contains(content,'fuzzy(models)') > 0;


ID
----------

1

SQL> select id from texts where contains(content,'fuzzy(exmaple)') >
0;


ID
----------

2

SQL> select id from texts where contains(content,'?dl365') > 0;


ID
----------

3


'?keyword' is a shortcut to 'fuzzy(keyword)' (fuzzy() can also accept
4 arguments for weighted matching.) See Oracle Text Reference for your
Oracle release for more information on this powerful technology.
Usually it's installed by default. In 10g and later, there's pretty
little effort needed to maintain Text indexes with "sync on commit"
option and scheduled online index rebuilds for volatile data.

Hth,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com



Reply With Quote
  #6  
Old   
Marten Lehmann
 
Posts: n/a

Default Re: Fuzzy string search? - 01-10-2008 , 02:23 PM



Hello,

Quote:
SQL> select id from texts where contains(content,'fuzzy(models)') > 0;


ID
----------

1
this sounds interesting. I will have a closer look at oracle text,
thanks for your help!

Regards
Marten


Reply With Quote
  #7  
Old   
Marten Lehmann
 
Posts: n/a

Default Re: Fuzzy string search? - 01-10-2008 , 02:23 PM



Hello,

Quote:
SQL> select id from texts where contains(content,'fuzzy(models)') > 0;


ID
----------

1
this sounds interesting. I will have a closer look at oracle text,
thanks for your help!

Regards
Marten


Reply With Quote
  #8  
Old   
Marten Lehmann
 
Posts: n/a

Default Re: Fuzzy string search? - 01-10-2008 , 02:23 PM



Hello,

Quote:
SQL> select id from texts where contains(content,'fuzzy(models)') > 0;


ID
----------

1
this sounds interesting. I will have a closer look at oracle text,
thanks for your help!

Regards
Marten


Reply With Quote
  #9  
Old   
Marten Lehmann
 
Posts: n/a

Default Re: Fuzzy string search? - 01-10-2008 , 02:23 PM



Hello,

Quote:
SQL> select id from texts where contains(content,'fuzzy(models)') > 0;


ID
----------

1
this sounds interesting. I will have a closer look at oracle text,
thanks for your help!

Regards
Marten


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.