dbTalk Databases Forums  

index on multiple word varchar column

comp.databases.postgresql comp.databases.postgresql


Discuss index on multiple word varchar column in the comp.databases.postgresql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Luis P. Mendes
 
Posts: n/a

Default index on multiple word varchar column - 11-28-2006 , 01:42 PM






Hi,

My problem is that in one of my tables I have a column for 'name'. It can
have from two to ten or mores words in it, separated by spaces. It's a
varchar(200) field.

What I want to have is a quick search on this column available to the
clients.

I've read the documentation. I found that btree indexes would be very
efficient if I was to search for the first word of 'name'.

But, I want all words in 'name' to be searchable and fast. I could not
find any answer to this.

If 'name' had a fixes number of words in it, I could separate the varios
words into different columns like 'name1', 'name2', etc. But this is not
the case.

The search should be case insensitive.

I've created a standard btree index but I don't know if it is of any use.

Is there any solution for this? I guess there is!

Maybe, I need a function to grab all words from all records and build an
index with it. But how?

I'd apreciate any help on this.


Luis P. Mendes



Reply With Quote
  #2  
Old   
Ch Lamprecht
 
Posts: n/a

Default Re: index on multiple word varchar column - 11-28-2006 , 03:11 PM






Luis P. Mendes wrote:
Quote:
Hi,

My problem is that in one of my tables I have a column for 'name'. It can
have from two to ten or mores words in it, separated by spaces. It's a
varchar(200) field.

What I want to have is a quick search on this column available to the
clients.

But, I want all words in 'name' to be searchable and fast. I could not
find any answer to this.

If 'name' had a fixes number of words in it, I could separate the varios
words into different columns like 'name1', 'name2', etc. But this is not
the case.

The search should be case insensitive.

I've created a standard btree index but I don't know if it is of any use.

Is there any solution for this? I guess there is!
Hi,
are you sure, that this column is in the right place?
To me it sounds like you should have a second table containing the (indexed)
"name" column and a foreign key referencing your first tables PK.

Christoph


Reply With Quote
  #3  
Old   
Luis P. Mendes
 
Posts: n/a

Default Re: index on multiple word varchar column - 11-28-2006 , 05:59 PM




Quote:
Hi,
are you sure, that this column is in the right place?
To me it sounds like you should have a second table containing the (indexed)
"name" column and a foreign key referencing your first tables PK.

Christoph
I don't see any advantage in what you're proposing.

Each record in the table has for the 'name' field something like:
'Luis Mendes Cristoph Mark David Smith'
'Michael Bush'
'Mark Steve Browne'

I just want a user to be able to search for 'Mark' and get results from
the first and third lines. It should not matter where 'Mark' is placed.

Luis




Reply With Quote
  #4  
Old   
HansH
 
Posts: n/a

Default Re: index on multiple word varchar column - 11-28-2006 , 06:49 PM



"Luis P. Mendes" <lupe (AT) localhost (DOT) localdomain> schreef in bericht
newsan.2006.11.28.23.59.34.362347 (AT) localhost (DOT) localdomain...
Quote:
are you sure, that this column is in the right place?
To me it sounds like you should have a second table containing the
(indexed)
"name" column and a foreign key referencing your first tables PK.

Christoph
I don't see any advantage in what you're proposing.

Each record in the table has for the 'name' field something like:
'Luis Mendes Cristoph Mark David Smith'
'Michael Bush'
'Mark Steve Browne'

I just want a user to be able to search for 'Mark' and get results from
the first and third lines. It should not matter where 'Mark' is placed.

It might become a matter of [lacking] performance
http://www.postgresql.org/docs/faqs.FAQ.html item 4.6
"When using wild-card operators such as LIKE or ~, indexes can only be used
in certain circumstances:
- The beginning of the search string must be anchored to the start of the
string, i.e. LIKE patterns must not start with %. ~ (regular expression)
patterns must start with ^.
- The search string can not start with a character class, e.g. [a-e].
- Case-insensitive searches such as ILIKE and ~* do not utilize indexes.
Instead, use expression indexes, which are described in section
http://www.postgresql.org/docs/faqs.FAQ.html#item4.8.
- The default C locale must be used during initdb because it is not possible
to know the next-greatest character in a non-C locale. You can create a
special text_pattern_ops index for such cases that work only for LIKE
indexing. "

Thus
select * from table where ' '||name||' 'ilike '% John %'
might be the easy but slow way out.

Slowness due to sequential scanning of the table may increase by the number
and size of the other fields.
A splitup may mitigate this increase to the cost of a more complex query
select * from table1 join table2 using(pk) where ' '||table2.name||' '
ilike '% John %'

You may consider to have table1 inherit the fields -pk and name- of table2
to keep current inserts, updates and deletes working.
Only the select query changes to ' select table1.* from table1 join table2
using(pk) where ' '||table2.name||' ' ilike '% John %' '

Need to benchmark some methods to choose one to fit your needs best.


HansH










Reply With Quote
  #5  
Old   
Karen Hill
 
Posts: n/a

Default Re: index on multiple word varchar column - 11-29-2006 , 03:29 PM




Luis P. Mendes wrote:
Quote:
Hi,

My problem is that in one of my tables I have a column for 'name'. It can
have from two to ten or mores words in it, separated by spaces. It's a
varchar(200) field.

What I want to have is a quick search on this column available to the
clients.

I've read the documentation. I found that btree indexes would be very
efficient if I was to search for the first word of 'name'.

But, I want all words in 'name' to be searchable and fast. I could not
find any answer to this.

If 'name' had a fixes number of words in it, I could separate the varios
words into different columns like 'name1', 'name2', etc. But this is not
the case.

The search should be case insensitive.

I've created a standard btree index but I don't know if it is of any use.

Is there any solution for this? I guess there is!

Maybe, I need a function to grab all words from all records and build an
index with it. But how?

I'd apreciate any help on this.

Hi Luis,

Try looking into tsearch2 in the contrib folder. It is a module that
facilitates full-text indexing which is what you are trying to do.
When 8.2 is released, it will be in the core postgresql IIRC.

http://www.sai.msu.su/~megera/postgr...-V2-intro.html



Reply With Quote
  #6  
Old   
Ch Lamprecht
 
Posts: n/a

Default Re: index on multiple word varchar column - 11-30-2006 , 02:15 AM



Luis P. Mendes schrieb:
Quote:
Hi,
are you sure, that this column is in the right place?
To me it sounds like you should have a second table containing the (indexed)
"name" column and a foreign key referencing your first tables PK.

Christoph

I don't see any advantage in what you're proposing.

The name column in a second table would contain one name for each row, so it
could be easily indexed.
But you probably have reasons for your table layout...

Quote:
Each record in the table has for the 'name' field something like:
'Luis Mendes Cristoph Mark David Smith'
'Michael Bush'
'Mark Steve Browne'

I just want a user to be able to search for 'Mark' and get results from
the first and third lines. It should not matter where 'Mark' is placed.

Luis



Reply With Quote
  #7  
Old   
Luis P. Mendes
 
Posts: n/a

Default Re: index on multiple word varchar column - 11-30-2006 , 12:48 PM



Quote:
Hi Luis,

Try looking into tsearch2 in the contrib folder. It is a module that
facilitates full-text indexing which is what you are trying to do.
When 8.2 is released, it will be in the core postgresql IIRC.

http://www.sai.msu.su/~megera/postgr...-V2-intro.html
Thank you all for your help.

It seems that tsearch2 is what I was looking for.

Luis P. Mendes


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.