dbTalk Databases Forums  

Search query, adding weight to exact result

comp.databases.mysql comp.databases.mysql


Discuss Search query, adding weight to exact result in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
jwcarlton
 
Posts: n/a

Default Search query, adding weight to exact result - 07-08-2011 , 10:07 PM






For a simple message board search engine, I'm using the following
query (in PHP, although I don't think the language matters):

$query = sprintf("
SELECT DISTINCT id, MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
AS occurrences
FROM forum_posts WHERE
MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
ORDER BY occurrences DESC",

$_GET['search'],
$_GET['search']);

(Note that the breaks are added here in an attempt to make that easier
for you to read.)

What I'm finding, though, is that I can have a thread with a lot of
posts that have the individual keywords (but not as an actual phrase)
that will have a higher result than a thread with less posts, but the
exact phrase.

For example, let's say that the search is for "Jason is cool". A post
that says "Jason went outside and said 'the weather today is pretty
cool'" is going to have the same weight as one that actually says
"Jason is cool", even though the second would be more likely to be
what the searcher is wanting.

So the question is, how do I modify the query to give a higher weight
to an exact result?

Reply With Quote
  #2  
Old   
Luuk
 
Posts: n/a

Default Re: Search query, adding weight to exact result - 07-09-2011 , 08:15 AM






On 09-07-2011 05:07, jwcarlton wrote:
Quote:
For a simple message board search engine, I'm using the following
query (in PHP, although I don't think the language matters):

$query = sprintf("
SELECT DISTINCT id, MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
AS occurrences
FROM forum_posts WHERE
MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
ORDER BY occurrences DESC",

$_GET['search'],
$_GET['search']);

(Note that the breaks are added here in an attempt to make that easier
for you to read.)

What I'm finding, though, is that I can have a thread with a lot of
posts that have the individual keywords (but not as an actual phrase)
that will have a higher result than a thread with less posts, but the
exact phrase.

For example, let's say that the search is for "Jason is cool". A post
that says "Jason went outside and said 'the weather today is pretty
cool'" is going to have the same weight as one that actually says
"Jason is cool", even though the second would be more likely to be
what the searcher is wanting.

So the question is, how do I modify the query to give a higher weight
to an exact result?
Your current search is:
'Jason is cool'

Try:
'>"Jason is cool" Jason is cool'

This should give 'Jason is cool' extra weight

see:
http://dev.mysql.com/doc/refman/5.5/...t-boolean.html

--
Luuk

Reply With Quote
  #3  
Old   
Álvaro G. Vicario
 
Posts: n/a

Default Re: Search query, adding weight to exact result - 07-11-2011 , 02:59 AM



El 09/07/2011 5:07, jwcarlton escribió/wrote:
Quote:
For a simple message board search engine, I'm using the following
query (in PHP, although I don't think the language matters):

$query = sprintf("
SELECT DISTINCT id, MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
AS occurrences
FROM forum_posts WHERE
MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
ORDER BY occurrences DESC",

$_GET['search'],
$_GET['search']);
*Sigh* I've already given this advice three or four times today and it's
only 10:00 a.m. here:

http://en.wikipedia.org/wiki/SQL_injection



Quote:
(Note that the breaks are added here in an attempt to make that easier
for you to read.)
Well, nothing prevents you from adding them in your real code and make
it easier to read *for you* as well


Sorry for not providing any actual help about the question subject.


--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- Mi sitio sobre programación web: http://borrame.com
-- Mi web de humor satinado: http://www.demogracia.com
--

Reply With Quote
  #4  
Old   
jwcarlton
 
Posts: n/a

Default Re: Search query, adding weight to exact result - 07-15-2011 , 01:43 AM



Quote:
For a simple message board search engine, I'm using the following
query (in PHP, although I don't think the language matters):

$query = sprintf("
SELECT DISTINCT id, MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
AS occurrences
FROM forum_posts WHERE
MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
ORDER BY occurrences DESC",

$_GET['search'],
$_GET['search']);

*Sigh* I've already given this advice three or four times today and it's
only 10:00 a.m. here:

http://en.wikipedia.org/wiki/SQL_injection
While I thank you for replying, I'm not sure how this page gives any
insight toward the question?


Quote:
Well, nothing prevents you from adding them in your real code and make
it easier to read *for you* as well
Sure, but I don't really have a problem reading it without breaks; in
fact, I find it a little easier without them. I just added breaks here
because some readers and Google Groups tend to wrap them in odd ways.

- J

Reply With Quote
  #5  
Old   
Peter H. Coffin
 
Posts: n/a

Default Re: Search query, adding weight to exact result - 07-15-2011 , 07:09 AM



On Thu, 14 Jul 2011 23:43:28 -0700 (PDT), jwcarlton wrote:
Quote:
Well, nothing prevents you from adding them in your real code and make
it easier to read *for you* as well

Sure, but I don't really have a problem reading it without breaks; in
fact, I find it a little easier without them. I just added breaks here
because some readers and Google Groups tend to wrap them in odd ways.
Then may we also humbly suggest that you get a real newsreader? There
are several good ones out there, and they all pretty much work the same.
That is, they'll edit, post, and display with fixed-width fonts and hard
line wrapping, as the good lord intended man to.

--
We're the technical experts. We were hired so that management could
ignore our recommendations and tell us how to do our jobs.
-- Mike Andrews

Reply With Quote
  #6  
Old   
Álvaro G. Vicario
 
Posts: n/a

Default Re: Search query, adding weight to exact result - 07-15-2011 , 07:41 AM



El 15/07/2011 8:43, jwcarlton escribió/wrote:
Quote:
For a simple message board search engine, I'm using the following
query (in PHP, although I don't think the language matters):

$query = sprintf("
SELECT DISTINCT id, MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
AS occurrences
FROM forum_posts WHERE
MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
ORDER BY occurrences DESC",

$_GET['search'],
$_GET['search']);

*Sigh* I've already given this advice three or four times today and it's
only 10:00 a.m. here:

http://en.wikipedia.org/wiki/SQL_injection

While I thank you for replying, I'm not sure how this page gives any
insight toward the question?
About SQL injection or about your original question?

SQL-injection-unaware coding is a technique used in corporate sites like
MySQL or Sony in order to handle their customer data and credit card
numbers to on-line criminals. (I already apologised for failing to
answer the latter.)


Quote:
Well, nothing prevents you from adding them in your real code and make
it easier to read *for you* as well

Sure, but I don't really have a problem reading it without breaks; in
fact, I find it a little easier without them.
You do? Then you either code with "Wrap lines" enabled or make a really
heavy use of the horizontal scroll bar ;-P


--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- Mi sitio sobre programación web: http://borrame.com
-- Mi web de humor satinado: http://www.demogracia.com
--

Reply With Quote
  #7  
Old   
onedbguru
 
Posts: n/a

Default Re: Search query, adding weight to exact result - 07-15-2011 , 06:42 PM



On Jul 9, 9:15*am, Luuk <L... (AT) invalid (DOT) lan> wrote:
Quote:
On 09-07-2011 05:07, jwcarlton wrote:









For a simple message board search engine, I'm using the following
query (in PHP, although I don't think the language matters):

$query = sprintf("
SELECT DISTINCT id, MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
AS occurrences
FROM forum_posts WHERE
MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
ORDER BY occurrences DESC",

$_GET['search'],
$_GET['search']);

(Note that the breaks are added here in an attempt to make that easier
for you to read.)

What I'm finding, though, is that I can have a thread with a lot of
posts that have the individual keywords (but not as an actual phrase)
that will have a higher result than a thread with less posts, but the
exact phrase.

For example, let's say that the search is for "Jason is cool". A post
that says "Jason went outside and said 'the weather today is pretty
cool'" is going to have the same weight as one that actually says
"Jason is cool", even though the second would be more likely to be
what the searcher is wanting.

So the question is, how do I modify the query to give a higher weight
to an exact result?

Your current search is:
'Jason is cool'

Try:
'>"Jason is cool" Jason is cool'

This should give 'Jason is cool' extra weight

see:http://dev.mysql.com/doc/refman/5.5/...t-boolean.html

--
Luuk
why would you not use
:
'>"Jason is cool" +Jason +is +cool'

As I read the docs, This says use give extra weight to "Jason is cool"
searching for 'Jason AND is AND cool' although "is" is a noise word
that would get discarded anyway. Using 'Jason is cool' by itself
infers search for 'Jason OR is OR cool' with no specific weight or
ranking.

From the docs
"Note
In implementing this feature, MySQL uses what is sometimes referred to
as implied Boolean logic, in which
+ stands for AND
- stands for NOT
[no operator] implies OR
"
How you implement that in your search feature is going to be fun...

Reply With Quote
  #8  
Old   
Luuk
 
Posts: n/a

Default Re: Search query, adding weight to exact result - 07-16-2011 , 02:29 AM



On 16-07-2011 01:42, onedbguru wrote:
Quote:
On Jul 9, 9:15 am, Luuk <L... (AT) invalid (DOT) lan> wrote:
On 09-07-2011 05:07, jwcarlton wrote:









For a simple message board search engine, I'm using the following
query (in PHP, although I don't think the language matters):

$query = sprintf("
SELECT DISTINCT id, MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
AS occurrences
FROM forum_posts WHERE
MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
ORDER BY occurrences DESC",

$_GET['search'],
$_GET['search']);

(Note that the breaks are added here in an attempt to make that easier
for you to read.)

What I'm finding, though, is that I can have a thread with a lot of
posts that have the individual keywords (but not as an actual phrase)
that will have a higher result than a thread with less posts, but the
exact phrase.

For example, let's say that the search is for "Jason is cool". A post
that says "Jason went outside and said 'the weather today is pretty
cool'" is going to have the same weight as one that actually says
"Jason is cool", even though the second would be more likely to be
what the searcher is wanting.

So the question is, how do I modify the query to give a higher weight
to an exact result?

Your current search is:
'Jason is cool'

Try:
'>"Jason is cool" Jason is cool'

This should give 'Jason is cool' extra weight

see:http://dev.mysql.com/doc/refman/5.5/...t-boolean.html

--
Luuk

why would you not use
:
'>"Jason is cool" +Jason +is +cool'

As I read the docs, This says use give extra weight to "Jason is cool"
searching for 'Jason AND is AND cool' although "is" is a noise word
that would get discarded anyway. Using 'Jason is cool' by itself
infers search for 'Jason OR is OR cool' with no specific weight or
ranking.

From the docs
"Note
In implementing this feature, MySQL uses what is sometimes referred to
as implied Boolean logic, in which
+ stands for AND
- stands for NOT
[no operator] implies OR
"
How you implement that in your search feature is going to be fun...








i did a simple test, and the '+' does not look like you want to have (in
this case)

mysql (test)> select i,t, match(t) AGAINST('>"aaaaaaaa bbbbbbbb"
aaaaaaaa bbbbbbbb' IN BOOLEAN MODE) f from search where match(t)
AGAINST('>"aaaaaaaa bbbbbbbb" aaaaaaaa bbbbbbbb' IN BOOLEAN MODE);
+----+-------------------+------+
Quote:
i | t | f |
+----+-------------------+------+
6 | aaaaaaaa | 1 |
7 | bbbbbbbb | 1 |
8 | aaaaaaaa bbbbbbbb | 3.5 |
9 | cccccccc bbbbbbbb | 1 |
10 | cccccccc aaaaaaaa | 1 |
+----+-------------------+------+
5 rows in set (0.00 sec)

mysql (test)> select i,t, match(t) AGAINST('>"aaaaaaaa bbbbbbbb"
+aaaaaaaa bbbbbbbb' IN BOOLEAN MODE) f from search where match(t)
AGAINST('>"aaaaaaaa bbbbbbbb" +aaaaaaaa bbbbbbbb' IN BOOLEAN MODE);
+----+-------------------+-----------------+
Quote:
i | t | f |
+----+-------------------+-----------------+
6 | aaaaaaaa | 1 |
8 | aaaaaaaa bbbbbbbb | 1.8333333730698 |
10 | cccccccc aaaaaaaa | 1 |
+----+-------------------+-----------------+
3 rows in set (0.00 sec)

mysql (test)> select i,t, match(t) AGAINST('>"aaaaaaaa bbbbbbbb"
+aaaaaaaa +bbbbbbbb' IN BOOLEAN MODE) f from search where match(t)
AGAINST('>"aaaaaaaa bbbbbbbb" +aaaaaaaa +bbbbbbbb' IN BOOLEAN MODE);
+---+-------------------+------+
Quote:
i | t | f |
+---+-------------------+------+
8 | aaaaaaaa bbbbbbbb | 1.5 |
+---+-------------------+------+
1 row in set (0.00 sec)

mysql (test)>


--
Luuk

Reply With Quote
  #9  
Old   
onedbguru
 
Posts: n/a

Default Re: Search query, adding weight to exact result - 07-17-2011 , 01:21 PM



On Jul 16, 3:29*am, Luuk <L... (AT) invalid (DOT) lan> wrote:
Quote:
On 16-07-2011 01:42, onedbguru wrote:









On Jul 9, 9:15 am, Luuk <L... (AT) invalid (DOT) lan> wrote:
On 09-07-2011 05:07, jwcarlton wrote:

For a simple message board search engine, I'm using the following
query (in PHP, although I don't think the language matters):

$query = sprintf("
SELECT DISTINCT id, MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
AS occurrences
FROM forum_posts WHERE
MATCH(subject, username, comment)
AGAINST ('%s' IN BOOLEAN MODE)
ORDER BY occurrences DESC",

$_GET['search'],
$_GET['search']);

(Note that the breaks are added here in an attempt to make that easier
for you to read.)

What I'm finding, though, is that I can have a thread with a lot of
posts that have the individual keywords (but not as an actual phrase)
that will have a higher result than a thread with less posts, but the
exact phrase.

For example, let's say that the search is for "Jason is cool". A post
that says "Jason went outside and said 'the weather today is pretty
cool'" is going to have the same weight as one that actually says
"Jason is cool", even though the second would be more likely to be
what the searcher is wanting.

So the question is, how do I modify the query to give a higher weight
to an exact result?

Your current search is:
'Jason is cool'

Try:
'>"Jason is cool" Jason is cool'

This should give 'Jason is cool' extra weight

see:http://dev.mysql.com/doc/refman/5.5/...t-boolean.html

--
Luuk

why would you not use
:
'>"Jason is cool" +Jason +is +cool'

As I read the docs, This says use give extra weight to "Jason is cool"
searching for 'Jason AND is AND cool' *although "is" is a noise word
that would get discarded anyway. Using 'Jason is cool' by itself
infers search for 'Jason OR is OR cool' with no specific weight or
ranking.

From the docs
"Note
In implementing this feature, MySQL uses what is sometimes referred to
as implied Boolean logic, in which
+ stands for AND
- stands for NOT
[no operator] implies OR
"
How you implement that in your search feature is going to be fun...

i did a simple test, and the '+' does not look like you want to have (in
this case)

mysql (test)> select i,t, match(t) AGAINST('>"aaaaaaaa bbbbbbbb"
aaaaaaaa bbbbbbbb' IN BOOLEAN MODE) f from search where match(t)
AGAINST('>"aaaaaaaa bbbbbbbb" aaaaaaaa bbbbbbbb' IN BOOLEAN MODE);
+----+-------------------+------+
| i *| t * * * * * * * * | f * *|
+----+-------------------+------+
| *6 | aaaaaaaa * * * * *| * *1 |
| *7 | bbbbbbbb * * * * *| * *1 |
| *8 | aaaaaaaa bbbbbbbb | *3.5 |
| *9 | cccccccc bbbbbbbb | * *1 |
| 10 | cccccccc aaaaaaaa | * *1 |
+----+-------------------+------+
5 rows in set (0.00 sec)

mysql (test)> select i,t, match(t) AGAINST('>"aaaaaaaa bbbbbbbb"
+aaaaaaaa bbbbbbbb' IN BOOLEAN MODE) f from search where match(t)
AGAINST('>"aaaaaaaa bbbbbbbb" +aaaaaaaa bbbbbbbb' IN BOOLEAN MODE);
+----+-------------------+-----------------+
| i *| t * * * * * * * * | f * * * * * * * |
+----+-------------------+-----------------+
| *6 | aaaaaaaa * * * * *| * * * * * * * 1 |
| *8 | aaaaaaaa bbbbbbbb | 1.8333333730698 |
| 10 | cccccccc aaaaaaaa | * * * * * * * 1 |
+----+-------------------+-----------------+
3 rows in set (0.00 sec)

mysql (test)> select i,t, match(t) AGAINST('>"aaaaaaaa bbbbbbbb"
+aaaaaaaa +bbbbbbbb' IN BOOLEAN MODE) f from search where match(t)
AGAINST('>"aaaaaaaa bbbbbbbb" +aaaaaaaa +bbbbbbbb' IN BOOLEAN MODE);
+---+-------------------+------+
| i | t * * * * * * * * | f * *|
+---+-------------------+------+
| 8 | aaaaaaaa bbbbbbbb | *1.5 |
+---+-------------------+------+
1 row in set (0.00 sec)

mysql (test)

--
Luuk
I dunno - it might be EXACTLY what he is looking for
From the OP: " says "Jason is cool", even though the second ( "Jason
is cool") would be more likely to be
what the searcher is wanting."

Developing a search feature is always interesting, because you have to
make some assumptions and program according to those assumptions and
database engine defaults.

I would say that the typical "search" is going to be an "AND" search -
not an OR search. Mysql defaults to "OR". Just my 2cents worth. If I
am looking for "Jason is cool", I do not want to see a post on how the
coolness of the weather - even though it finds one of the words I am
looking for. Have you seen the 'BING' commercials on search overload?

Reply With Quote
  #10  
Old   
Luuk
 
Posts: n/a

Default BING [was:Re: Search query, adding weight to exact result] - 07-17-2011 , 01:42 PM



On 17-07-2011 20:21, onedbguru wrote:
Quote:
Have you seen the 'BING' commercials on search overload?
BING?
what is that?

O wacht,
dat is die site die probeert om die andere zoek-site na te doen:
http://www.bing.com/?setmkt=nl-NL

;(

--
Luuk

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.