MATCH ... AGAINST not working as expected

comp.databases.mysql

  #1  
EastSideDev
MATCH ... AGAINST not working as expected - 01-15-2011, 06:19 PM

Per the MySQL online docs, the statement below:

SELECT * FROM `table` WHERE MATCH (Description)
AGAINST ('+party +fun +town' IN BOOLEAN MODE)

is supposed to return rows where ALL THREE words are present. Instead
it is returning all rows where all three words are present as well as
rows where 2 out of the three words are present.

Is my understanding of how this is supposed to work correct?

  #2  
Álvaro G. Vicario
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 02:21 AM

On 16/01/2011 1:19, EastSideDev wrote:
> Per the MySQL online docs, the statement below:
>
> SELECT * FROM `table` WHERE MATCH (Description)
> AGAINST ('+party +fun +town' IN BOOLEAN MODE)
>
> is supposed to return rows where ALL THREE words are present. Instead
> it is returning all rows where all three words are present as well as
> rows where 2 out of the three words are present.
>
> Is my understanding of how this is supposed to work correct?

"fun" is probably being ignored because of its 3 char length:

http://dev.mysql.com/doc/refman/5.0/...-language.html

«Some words are ignored in full-text searches:

* Any word that is too short is ignored. The default minimum length of
words that are found by full-text searches is four characters.

* Words in the stopword list are ignored. A stopword is a word such as
“the” or “some” that is so common that it is considered to have zero
semantic value. There is a built-in stopword list, but it can be
overwritten by a user-defined list.»
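
A rough way to check whether the minimum word length is the cause. This is only a sketch: it assumes a MyISAM table with a FULLTEXT index on Description (the table and column names are the original poster's), and the behaviour described in the comments is what the documentation quoted above implies, not something verified against that server.

```sql
-- Show the server's current minimum indexed word length for MyISAM
-- FULLTEXT indexes (4 by default).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- With a minimum of 4, "fun" is never indexed, so the +fun term is
-- ignored and the query effectively requires only "party" and "town".
-- If that is what is happening, this query should return the same
-- rows as the original three-word query:
SELECT * FROM `table`
WHERE MATCH (Description) AGAINST ('+party +town' IN BOOLEAN MODE);
```

If both queries return the same rows, the short word is almost certainly being dropped rather than matched.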




--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humor site: http://www.demogracia.com
--

  #3  
Erick T. Barkhuis
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 02:27 AM

"Álvaro G. Vicario":

> On 16/01/2011 1:19, EastSideDev wrote:
>
>> SELECT * FROM `table` WHERE MATCH (Description)
>> AGAINST ('+party +fun +town' IN BOOLEAN MODE)
>
> "fun" is probably being ignored because of its 3 char length:
> * Any word that is too short is ignored. The default minimum length
> of words that are found by full-text searches is four characters.
>
> * Words in the stopword list are ignored. A stopword is a word such as
> “the” or “some” that is so common that ...»

If "the" is ignored because it has only three characters, why is it on
the default stopword list?


--
Erick

  #4  
Luuk
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 02:36 AM

On 17-01-11 09:27, Erick T. Barkhuis wrote:
> "Álvaro G. Vicario":
>
>> On 16/01/2011 1:19, EastSideDev wrote:
>>
>>> SELECT * FROM `table` WHERE MATCH (Description)
>>> AGAINST ('+party +fun +town' IN BOOLEAN MODE)
>>
>> "fun" is probably being ignored because of its 3 char length:
>> * Any word that is too short is ignored. The default minimum length
>> of words that are found by full-text searches is four characters.
>>
>> * Words in the stopword list are ignored. A stopword is a word such as
>> “the” or “some” that is so common that ...»
>
> If "the" is ignored because it has only three characters, why is it on
> the default stopword list?

Because the minimum word length can be changed, from the default value 4
to a shorter value...

http://dev.mysql.com/doc/refman/5.0/...t_min_word_len
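
For reference, lowering the limit on a MyISAM setup takes roughly the steps below, as described in the fine-tuning section of the MySQL manual. Treat this as a sketch: `table` is the original poster's table name, and the value 3 is simply what would be needed for "fun" to be indexed.

```sql
-- 1. In the server option file (e.g. my.cnf), under [mysqld], set:
--        ft_min_word_len=3
-- 2. Restart the MySQL server; this variable cannot be changed at runtime.
-- 3. Rebuild the FULLTEXT index so existing rows are re-indexed with
--    the new limit:
REPAIR TABLE `table` QUICK;

-- 4. Confirm the new value is in effect:
SHOW VARIABLES LIKE 'ft_min_word_len';
```

Once three-letter words are indexed, the stopword list is the only thing still filtering out words such as "the".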

--
Luuk

  #5  
Erick T. Barkhuis
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 02:44 AM

Luuk:

> On 17-01-11 09:27, Erick T. Barkhuis wrote:
>> "Álvaro G. Vicario":
>>
>>> On 16/01/2011 1:19, EastSideDev wrote:
>>>
>>>> SELECT * FROM `table` WHERE MATCH (Description)
>>>> AGAINST ('+party +fun +town' IN BOOLEAN MODE)
>>>
>>> "fun" is probably being ignored because of its 3 char length:
>>> * Any word that is too short is ignored. The default minimum
>>> length of words that are found by full-text searches is four
>>> characters.
>>>
>>> * Words in the stopword list are ignored. A stopword is a word
>>> such as “the” or “some” that is so common that ...»
>>
>> If "the" is ignored because it has only three characters, why is
>> it on the default stopword list?
>
> Because the minimum word length can be changed, from the default
> value 4 to a shorter value...
>
> http://dev.mysql.com/doc/refman/5.0/...t_min_word_len

Yes, but so can the stopword list. So, _if_ someone is smart enough to
reduce the default word length, he should add all the three-letter
words that he wants to ignore to the stopword list at that moment.

I was just wondering: if MySQL has defaults for the minimum word length
and the stopword list, why would these defaults contain redundant
values?
Defaults can be changed, and the user is responsible for correct values
in both. But the _default_ doesn't need to contain redundant values,
does it?


--
Erick

  #6  
Luuk
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 02:55 AM

On 17-01-11 09:44, Erick T. Barkhuis wrote:
> Luuk:
>
>> On 17-01-11 09:27, Erick T. Barkhuis wrote:
>>> "Álvaro G. Vicario":
>>>
>>>> On 16/01/2011 1:19, EastSideDev wrote:
>>>>
>>>>> SELECT * FROM `table` WHERE MATCH (Description)
>>>>> AGAINST ('+party +fun +town' IN BOOLEAN MODE)
>>>>
>>>> "fun" is probably being ignored because of its 3 char length:
>>>> * Any word that is too short is ignored. The default minimum
>>>> length of words that are found by full-text searches is four
>>>> characters.
>>>>
>>>> * Words in the stopword list are ignored. A stopword is a word
>>>> such as “the” or “some” that is so common that ...»
>>>
>>> If "the" is ignored because it has only three characters, why is
>>> it on the default stopword list?
>>
>> Because the minimum word length can be changed, from the default
>> value 4 to a shorter value...
>>
>> http://dev.mysql.com/doc/refman/5.0/...t_min_word_len
>
> Yes, but so can the stopword list. So, _if_ someone is smart enough to
> reduce the default word length, he should add all the three-letter
> words that he wants to ignore to the stopword list at that moment.
>
> I was just wondering: if MySQL has defaults for the minimum word length
> and the stopword list, why would these defaults contain redundant
> values?
> Defaults can be changed, and the user is responsible for correct values
> in both. But the _default_ doesn't need to contain redundant values,
> does it?


Indeed, it does not need to contain them.

But it also includes, e.g., 'wherein', and that word is never used by me,
so it's redundant to me as well....

http://dev.mysql.com/doc/refman/5.0/...stopwords.html

--
Luuk

  #7  
Álvaro G. Vicario
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 04:21 AM

On 17/01/2011 9:27, Erick T. Barkhuis wrote:
> "Álvaro G. Vicario":
>
>> On 16/01/2011 1:19, EastSideDev wrote:
>>
>>> SELECT * FROM `table` WHERE MATCH (Description)
>>> AGAINST ('+party +fun +town' IN BOOLEAN MODE)
>>
>> "fun" is probably being ignored because of its 3 char length:
>> * Any word that is too short is ignored. The default minimum length
>> of words that are found by full-text searches is four characters.
>>
>> * Words in the stopword list are ignored. A stopword is a word such as
>> “the” or “some” that is so common that ...»
>
> If "the" is ignored because it has only three characters, why is it on
> the default stopword list?

The OP's example illustrates this just fine: he'll probably consider
lowering the minimum word length so he's able to find "fun", but I don't
think he's especially interested in "the".

They're just similar concepts that MySQL allows to be handled separately.


--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My satin-finish humor site: http://www.demogracia.com
--

  #8  
Erick T. Barkhuis
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 04:29 AM

"Álvaro G. Vicario":

> On 17/01/2011 9:27, Erick T. Barkhuis wrote:
>
>> If "the" is ignored because it has only three characters, why is it
>> on the default stopword list?
>
> The OP's example illustrates this just fine: he'll probably consider
> lowering the minimum word length so he's able to find "fun", but I
> don't think he's especially interested in "the".

So, he doesn't include "the" in his search phrase.

> They're just similar concepts that MySQL allows to be handled
> separately.

In a way, I understand this.
Yet, if someone isn't interested in stopwords, why would [s]he include
them in the query?

%puzzled%

--
Erick

  #9  
Captain Paralytic
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 05:31 AM

On Jan 17, 10:29 am, "Erick T. Barkhuis" <erick.use-... (AT) ardane (DOT) c.o.m> wrote:
> In a way, I understand this.
> Yet, if someone isn't interested in stopwords, why would [s]he include
> them in the query?
>
> %puzzled%

Because generally the words in the query are supplied by a web site
user typing into a search box, and the decision of which words should be
searchable is the programmer's.

  #10  
Erick T. Barkhuis
Re: MATCH ... AGAINST not working as expected - 01-17-2011, 05:36 AM

Captain Paralytic:

> On Jan 17, 10:29 am, "Erick T. Barkhuis" <erick.use-... (AT) ardane (DOT) c.o.m> wrote:
>> In a way, I understand this.
>> Yet, if someone isn't interested in stopwords, why would [s]he
>> include them in the query?
>>
>> %puzzled%
>
> Because generally the words in the query are supplied by a web site
> user typing into a search box, and the decision of which words should be
> searchable is the programmer's.

That will be the reason, yes.

(Still strange: if I were a user, typing in "some" or "whereas" -
knowing that those words are in the text somewhere - I would frown if
the application said that these words are not included, assuming they
are in the stopwords list.

But OK, at some point, somewhere, this feature may be useful.)

--
Erick
