#2
Hi there,

I am having some trouble with my TSearch2 installation. I installed it as described in http://www.sai.msu.su/~megera/oddmus...compound_words

I used the German myspell dictionary from http://lingucomponent.openoffice.org/spell_dic.html and converted it with my2ispell.

Nearly everything is working fine so far, except for two problems:

1.) The stopword file seems to be ignored. If I try

    SELECT to_tsvector('default_german', 'ein Haus')

I get

    'ein':1 'haus':2

"ein" should be a stopword for German (and it is defined in the german.stop file as well).

2.) The compound words feature doesn't work either. I have tried a lot of words, e.g. "Fehlermeldung". With

    SELECT to_tsvector('default_german', 'Fehlermeldung')

I only get

    'fehlermeldung':1

but I would expect "fehler" and "meldung" as separate entries.

Is there anything wrong with the dictionary or my configuration?

My current configuration:

pg_ts_cfg:
default | default | C
default_russian | default | ru_RU.KOI8-R
simple | default | NULL
default_german | default | de_DE.ISO8859-1

pg_ts_cfgmap:
default_german | host | {simple}
default_german | hword | {simple}
default_german | int | {simple}
default_german | nlhword | {simple}
default_german | nlpart_hword | {simple}
default_german | nlword | {simple}
default_german | part_hword | {simple}
default_german | sfloat | {simple}
default_german | uint | {simple}
default_german | uri | {simple}
default_german | url | {simple}
default_german | version | {simple}
default_german | word | {simple}
default_german | lpart_hword | {de_ispell,german_snowball}
default_german | lword | {de_ispell,german_snowball}
default_german | lhword | {de_ispell,german_snowball}

pg_ts_dict:
de_ispell | 17166 | DictFile="/usr/local/pgsql/share/contrib/dictonary/german.dict", AffFile="/usr/local/pgsql/share/contrib/dictonary/german.aff", StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop" | 17167 | NULL
german_snowball | 17357 | NULL | 17162 | Snowball stemmer for german

Can anyone help me?

regards
Timo

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
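One way to narrow down both symptoms is to ask each dictionary in the lword chain about a word on its own, instead of going through to_tsvector. The queries below are only a sketch: the dictionary and configuration names are the ones from this post, and the reading of the results assumes that lexize returns an empty array for a stop word and NULL for a word the dictionary does not recognize (which then falls through to the next dictionary in the chain). Note that the german_snowball row has NULL where a stop file could be given, so it may have no stop list of its own.

```sql
-- Sketch only: names are taken from the configuration posted above.
-- Ask each dictionary in the lword chain about the stop word separately.
SELECT lexize('de_ispell', 'ein');        -- assumed: '{}' means "known stop word"
SELECT lexize('german_snowball', 'ein');  -- a non-empty result would explain 'ein':1

-- Same isolated check for the compound word.
SELECT lexize('de_ispell', 'Fehlermeldung');

-- Which dictionaries handle which token types in this configuration?
SELECT tok_alias, dict_name
FROM pg_ts_cfgmap
WHERE ts_name = 'default_german'
ORDER BY tok_alias;

-- With which options was each dictionary initialised?
SELECT dict_name, dict_initoption
FROM pg_ts_dict
WHERE dict_name IN ('de_ispell', 'german_snowball');
```

If de_ispell returns NULL for 'ein' and german_snowball returns {ein}, the stop file is probably not being applied by the dictionary that actually handles the word.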
#4
Timo,

please check that you applied the patch for compound word support. Which version of PostgreSQL are you using? Does the ispell dictionary work for non-compound words?

Oleg

On Fri, 5 Nov 2004, Timo Haberkern wrote:
> Hi there, I am having some trouble with my TSearch2 installation. [...]

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg (AT) sai (DOT) msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo (AT) postgresql (DOT) org)
#5
Oleg,

I use TSearch2 with PostgreSQL 7.4.6 and I applied the compound word patch yesterday. The configuration changed a little bit, but the result is the same: I get no compound words.

I'm using the locale de_DE with encoding ISO8859-1 for the database.

I think ispell is working correctly except for the compound words. If I try

    SELECT lexize('de_ispell', 'springt')

I get

    lexize
    {springen,springen}

which seems correct. But

    SELECT lexize('de_ispell', 'Autobahn')

results in

    lexize
    {autobahn}

I would expect {auto,bahn,autobahn}.

The new configuration after the compound word patch:

dict_name | dict_init | dict_initoption | dict_lexize | dict_comment
simple | dex_init(text) | NULL | dex_lexize(internal,internal,integer) | Simple example of dictionary.
en_stem | snb_en_init(text) | /usr/local/pgsql/share/contrib/english.stop | snb_lexize(internal,internal,integer) | English Stemmer. Snowball.
ru_stem | snb_ru_init(text) | /usr/local/pgsql/share/contrib/russian.stop | snb_lexize(internal,internal,integer) | Russian Stemmer. Snowball.
ispell_template | spell_init(text) | NULL | spell_lexize(internal,internal,integer) | ISpell interface. Must have .dict and .aff files
synonym | syn_init(text) | NULL | syn_lexize(internal,internal,integer) | Example of synonym dictionary
de_ispell | spell_init(text) | DictFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.dict", AffFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.aff", StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop" | spell_lexize(internal,internal,integer) | NULL

Timo
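For readers following along: in tsearch2 the dictionary options live in the dict_initoption column of pg_ts_dict, so a row like the de_ispell one in this post is normally produced by an UPDATE on that table. The statement below is a sketch that simply reuses the paths quoted in this thread; they are specific to Timo's installation. It is also an assumption here that a backend reads the dictionary files when it first uses the dictionary, so a new session may be needed before a change is visible.

```sql
-- Sketch: point de_ispell at the converted dictionary files.
-- Paths are the ones quoted in this thread; adjust them to your installation.
UPDATE pg_ts_dict
SET dict_initoption =
       'DictFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.dict", '
    || 'AffFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.aff", '
    || 'StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop"'
WHERE dict_name = 'de_ispell';
```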
#6
On Fri, 5 Nov 2004, Timo Haberkern wrote:
> But a SELECT lexize('de_ispell', 'Autobahn') results in {autobahn}. I would expect {auto,bahn,autobahn}.

Hmm, have you checked 'Autobahn' in the ispell dictionary? Does the dictionary you used support the 'Z' flag for compound words?

> The new configuration after the compound word patch:

Seems you overestimate my capabilities.

Regards,
Oleg
#7
Sorry for the late answer, I was on holiday. See my remarks below.

Oleg Bartunov wrote:
> Hmm, have you checked 'Autobahn' in the ispell dictionary? Does the dictionary you used support the 'Z' flag for compound words?

Autobahn is in the ispell dictionary. What does an ispell dictionary need in order to support the Z flag?

Timo
#8
On Wed, 17 Nov 2004, Timo Haberkern wrote:
> Autobahn is in the ispell dictionary. What does an ispell dictionary need in order to support the Z flag?

Try

    ispell -C Autobahn

and search for 'compound' in 'man ispell' for details. There is only a problem if ispell *does* split the word correctly while tsearch2 doesn't. You should find a correct ispell dictionary for German or create one yourself. You may consult monzilla.net http://staff.science.uva.nl/~christo...roject-dr.html

Regards,
Oleg
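As a rough pointer on what "supporting the flag" can mean: the PostgreSQL documentation for ispell dictionaries describes compound support in terms of a compoundwords controlled statement in the affix file, which names the flag that marks dictionary words allowed to take part in compounds. The sketch below shows what one might look for and how to re-test afterwards. The sample entries are hypothetical and are not taken from Timo's files, and the need for a fresh session is an assumption.

```sql
-- Assumption: a compound-aware affix file contains a statement such as
--     compoundwords controlled z
-- and the .dict entries that may be combined carry that flag, e.g. (hypothetical)
--     auto/z
--     bahn/z
-- After switching de_ispell to such files (in a fresh session), re-run the test:
SELECT lexize('de_ispell', 'Autobahn');
```

If the result still contains only {autobahn}, the dictionary files are the first place to look, as suggested above.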