none wrote:
Quote:
Hello there,
I have a table with many text and varchar fields and I would like to get
all distinct words from these filelds.
For example:
table Pet:
id int(10) unsigned
legende text
notes varchar(255)
#id #legende #notes
1 mysql is very very good' is it true
2 just mysql test
I would like to get a list (sorted if possible) like:
#word #occurence number #id
good 1 1
is 1 1
just 1 2
mysql 2 1,2
test 2 1
very 2 1 |
There's no SQL function in MySQL to split a string into multiple strings
on whitespace (and no means to loop over the resulting substrings
anyway). So you're going to have to do that part of it in application code.
For instance, you could have tables like this:
CREATE TABLE WORD (
WORD_ID INT(10) UNSIGNED,
WORD VARCHAR(255) NOT NULL
);
CREATE TABLE W2P (
WORD_ID INT(10) UNSIGNED REFERENCES WORD(WORD_ID),
PET_ID INT(10) UNSIGNED REFERENCES PET(ID),
PRIMARY KEY (WORD_ID,PET_ID)
);
Then fetch all records from Pet, loop over the result set, split the
strings into words, and for each word do the following:
INSERT IGNORE INTO WORD (WORD) VALUES(?)
supply the word as the parameter
wid = SELECT WORD_ID FROM WORD WHERE WORD = ?
pid = the id from the current record in the loop
INSERT IGNORE INTO W2P (WORD_ID, PET_ID) VALUES (wid, pid)
Once you finish the loop, those tables should be completely populated.
Then you can use the following query to get the output that you described:
SELECT W.WORD, COUNT(W.WORD) AS occurrence_number,
GROUP_CONCAT(P.ID) AS id
FROM WORD AS W INNER JOIN W2P ON W.WORD_ID = W2P.WORD_ID
INNER JOIN Pet AS P ON W2P.PET_ID = P.ID
GROUP BY W.WORD_ID
The GROUP_CONCAT() function was added in MySQL 4.1.
I'll leave it to you to enhance this to account for multiple instances
of a word within one row of the Pet table.
Regards,
Bill K.