![]() | |
#1
| |||
| |||
|
|
FIND_IN_SET('c','a,b,c,d') | +----------------------------+ 3 | +----------------------------+ |
|
GETELEMENT_AT_INDEX_IN_SET(3.'a,b,c,d');| +-------------------------------------------------------+ c | +----------------------------+ |
|
complicated | +-------------+ c | +-------------+ |
#2
| |||
| |||
|
|
Hi all, I'm afraid I need some help with a string manipulation function in mysql. Suppose I have a column which contains a comma-separated set of unique strings e.g. a,aab,b,bc,de c,de,dd,dfr xo,ff,fr I would like to retrieve the nth element of a set in the same simple way that find_in_set allows me retrieve the index of an element within the list/set e.g. mysql> SELECT FIND_IN_SET('c','a,b,c,d'); +----------------------------+ | FIND_IN_SET('c','a,b,c,d') | +----------------------------+ | 3 | +----------------------------+ 1 row in set (0.00 sec) what I want is a "reverse find_in_set lookup" like mysql>SELECT GETELEMENT_AT_INDEX_IN_SET(3.'a,b,c,d'); ... answer would be +------------------------------------------------------+ | GETELEMENT_AT_INDEX_IN_SET(3.'a,b,c,d');| +-------------------------------------------------------+ | c | +----------------------------+ 1 row in set (0.00 sec) (thats just made up) I ' ve found away to achieve this by using various string function but I'm quite sure there must be a better and more effective and simpler way: mysql> SELECT MID('a,b,c,d', length(substring_index('a,b,c,d',",",3)), length(substring_index('a,b,c,d',",",4))- length(substring_index('a,b,c,d',",",3))-1 ) complicated; +-------------+ | complicated | +-------------+ | c | +-------------+ 1 row in set (0.00 sec) I know I could store that into a function and have my own function to accomplish this rather nicely but isn't there a built in function to achieve this? Any hints or tips or reference I didnt find? TIA Chris |
#3
| |||
| |||
|
|
Hi all, ... Any hints or tips or reference I didnt find? |
|
TIA Chris |
#4
| ||||||
| ||||||
|
|
On May 5, 3:17 pm, "cvh@LE" <christian.han... (AT) cpi-service (DOT) com> wrote: Hi all, ... Any hints or tips or reference I didnt find? RTFM would reveal SUBSTRING_INDEX():http://dev.mysql.com/doc/refman/5.0/....html#function... TIA Chris |
|
mysql> SELECT MID('a,b,c,d', length(substring_index('a,b,c,d',",",3)), length(substring_index('a,b,c,d',",",4))- length(substring_index('a,b,c,d',",",3))-1 ) complicated; |
|
element_in_set('So this is a sample set separated by spaces'," ",4) +--------------------------------------------------------------------- |
|
sample +--------------------------------------------------------------------- |
|
element_in_set('this,is,a,comma,separated,set,with ,9,elements',",", 6) | |
|
with +----------------------------------------------------------------------- |
#5
| |||
| |||
|
|
On May 5, 7:13 pm, toby <t... (AT) telegraphics (DOT) com.au> wrote: On May 5, 3:17 pm, "cvh@LE" <christian.han... (AT) cpi-service (DOT) com> wrote: Hi all, ... Any hints or tips or reference I didnt find? RTFM would reveal SUBSTRING_INDEX():http://dev.mysql.com/doc/refman/5.0/....html#function... TIA Chris RTFM??!? If you had read what I 've written you would have realised that I am well aware of substring_index: mysql> SELECT MID('a,b,c,d', length(substring_index('a,b,c,d',",",3)), length(substring_index('a,b,c,d',",",4))- length(substring_index('a,b,c,d',",",3))-1 ) complicated; It rather seems to me that you dont know what a SET is! So RTFM to you. If you dont have valueble input than just leave it to others to give advice. substring index is helpful function I do use to achieve the goal of returning a string As said , I solved the problem but am wondering if there does exist a built-in function like FIND_IN_SET which takes a index as parameter and not a string-element. I created my own function as follows : CREATE FUNCTION element_in_set (mystring varchar(4096),sep varchar(25), myindex int(4)) RETURNS varchar(4096) DETERMINISTIC BEGIN RETURN replace(substr(mystring,length(substring_index(mys tring,sep,myindex)) +1,length(substring_index(mystring,sep,myindex+1)) - length(substring_index(mystring,sep,myindex))),sep ,""); END $$ mysql> -- Select the 5th (index=4) element from a space separated set mysql> select element_in_set('So this is a sample set separated by spaces'," ",4); +--------------------------------------------------------------------- + | element_in_set('So this is a sample set separated by spaces'," ",4) | +--------------------------------------------------------------------- + | sample | +--------------------------------------------------------------------- + 1 row in set (0.00 sec) mysql> -- finding the 7th ( index=6) element in aa comma separated set mysql> select element_in_set('this,is,a,comma,separated,set,with , 9,elements',",",6); +----------------------------------------------------------------------- + | element_in_set('this,is,a,comma,separated,set,with ,9,elements',",", 6) | +----------------------------------------------------------------------- + | with | +----------------------------------------------------------------------- + 1 row in set (0.00 sec) So is there a built-in function? @Jerry; Unfortunately this is not an option in this case. The database is to be taken as is, but I am afraid I cant provide more information on this. So just trust me on this. |
#6
| |||
| |||
|
#7
| |||
| |||
|
|
@Jerry : As said, I am neither the admin nor have I the privileges to normalize the db. Furthermore normalizing this column would create more overhead than performance gain could be won. |
#8
| ||||
| ||||
|
|
cvh@LE wrote: @Jerry : As said, I am neither the admin nor have I the privileges to normalize the db. Furthermore normalizing this column would create more overhead than performance gain could be won. Not having the authority is one thing. *Claiming it will create more overhead is another - and probably FALSE. Very seldom have I actually SEEN a degradation of performance when normalizing a database. *And virtually never when normalizing a column with multiple values. *That almost always improves performance across the board. *And it definitely improves performance on this particular query. It was true that normalization hurt database performance - 25 years ago. * But things have come a long ways since then, and RDBMS's typically do better with normalized data than not. Of course, there are always exceptions. *But until you actually take performance benchmarks with your statements and your data, you can't say that. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstuck... (AT) attglobal (DOT) net ================== |
|
pk | col1 | +------+------+ 1 | a | 2 | b | 1 | c | 2 | e | 1 | f | 2 | g | +------+------+ |
|
pk | group_concat(col1 separator ",") | +------+----------------------------------+ 1 | a,c,f | 2 | b,e,g | +------+----------------------------------+ |
|
pk | 2ndElement | +------+------------+ 1 | c | 2 | e | +------+------------+ |
#9
| |||
| |||
|
|
On 6 Mai, 13:09, Jerry Stuckle <jstuck... (AT) attglobal (DOT) net> wrote: cvh@LE wrote: @Jerry : As said, I am neither the admin nor have I the privileges to normalize the db. Furthermore normalizing this column would create more overhead than performance gain could be won. Not having the authority is one thing. Claiming it will create more overhead is another - and probably FALSE. Very seldom have I actually SEEN a degradation of performance when normalizing a database. And virtually never when normalizing a column with multiple values. That almost always improves performance across the board. And it definitely improves performance on this particular query. It was true that normalization hurt database performance - 25 years ago. But things have come a long ways since then, and RDBMS's typically do better with normalized data than not. Of course, there are always exceptions. But until you actually take performance benchmarks with your statements and your data, you can't say that. -- ================== Remove the "x" from my email address Jerry Stuckle JDS Computer Training Corp. jstuck... (AT) attglobal (DOT) net ================== @Jerry I absolutely agree with you on normalisation. And I, too, have seldomly (although I do have) experienced that normalization is a performance downer. However, the application-database which makes use of this column- containing-a-set problem is normalized already, except for this column which happens to contain data similar to comments. Comments mostly are formulated as sentences, which happens to be nothing more than an ordered space separated set. This column contains exactly the same. I doubt anybody would normalize a short one-sentence comment field in a table holding, let's say, additional remarks (3NF) given as part of responses (2NF) by respondents (1NF) in surveys. Anyway, the whole question of mine targeted a function allowing me to retrieve the nth element in a set (I agree, the word occurence was the wrong choice - but, hey, English isnt my native tongue). To illustrate the problem a bit: The column in question containing the set could itself be the result of an aggregate function : imagine : mysql> create table test1 (pk int(1),col1 varchar(20)); Query OK, 0 rows affected (0.00 sec) mysql> insert into test1(pk,col1) values (1,"a"),(2,"b"),(1,"c"), (2,"e"),(1,"f"),(2,"g"); Query OK, 6 rows affected (0.01 sec) Records: 6 Duplicates: 0 Warnings: 0 mysql> select * from test1; +------+------+ | pk | col1 | +------+------+ | 1 | a | | 2 | b | | 1 | c | | 2 | e | | 1 | f | | 2 | g | +------+------+ 6 rows in set (0.01 sec) mysql> select pk,group_concat(col1 separator ",") from test1 group by 1; +------+----------------------------------+ | pk | group_concat(col1 separator ",") | +------+----------------------------------+ | 1 | a,c,f | | 2 | b,e,g | +------+----------------------------------+ 2 rows in set (0.00 sec) mysql> select pk,metadaten.element_in_set(group_concat(col1 separator ","),",",2) 2ndElement from test1 group by 1; +------+------------+ | pk | 2ndElement | +------+------------+ | 1 | c | | 2 | e | +------+------------+ 2 rows in set (0.00 sec) mysql |
#10
| |||
| |||
|
|
cvh@LE wrote: @Jerry I absolutely agree with you on normalisation. And I, too, have seldomly (although I do have) experienced that normalization is a performance downer. However, the application-database which makes use of this column- containing-a-set problem is normalized already, except for this column which happens to contain data similar to comments. Comments mostly are formulated as sentences, which happens to be nothing more than an ordered space separated set. This column contains exactly the same. I doubt anybody would normalize a short one-sentence comment field in a table holding, let's say, additional remarks (3NF) given as part of responses (2NF) by respondents (1NF) in surveys. Yes, I understand your design. But there are no functions to retrieve information form a set because the set itself violates 1NF. And if the database is normalized properly, there is no need. And yes, I would normalize such a field in such circumstances. |
![]() |
| Thread Tools | |
| Display Modes | |
| |