![]() | |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi, suppose I want to store log messages in a table, or I want to have translations of messages in a table. To avoid duplicates, many programmers would compute a hash on the text and use it for indexing. I wonder if the postgres indexing facilities would handle longer text columns well by itself, so there is not much to gain by hashing. I am talking about modest size tables, in web applications. /Str. |
#3
| |||
| |||
|
|
Hi, suppose I want to store log messages in a table, or I want to have translations of messages in a table. To avoid duplicates, many programmers would compute a hash on the text and use it for indexing. I wonder if the postgres indexing facilities would handle longer text columns well by itself, so there is not much to gain by hashing. I am talking about modest size tables, in web applications. /Str. |
#4
| |||
| |||
|
|
From: http://www.postgresql.org/docs/8.3/s...xes-types.html "Hash indexes can only handle simple equality comparisons. The query planner will consider using a hash index whenever an indexed column is involved in a comparison using the = operator. (But hash indexes do not support IS NULL searches.) The following command is used to create a hash index: CREATE INDEX name ON table USING hash (column); Note: Testing has shown PostgreSQL's hash indexes to perform no better than B-tree indexes, and the index size and build time for hash indexes is much worse. Furthermore, hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. For these reasons, hash index use is presently discouraged." Il 03/03/2011 9.29, M. Strobel ha scritto: Hi, suppose I want to store log messages in a table, or I want to have translations of messages in a table. To avoid duplicates, many programmers would compute a hash on the text and use it for indexing. I wonder if the postgres indexing facilities would handle longer text columns well by itself, so there is not much to gain by hashing. I am talking about modest size tables, in web applications. /Str. |
#5
| |||
| |||
|
|
On Thu, 03 Mar 2011 10:29:53 +0200, M. Strobel wrote: Hi, suppose I want to store log messages in a table, or I want to have translations of messages in a table. To avoid duplicates, many programmers would compute a hash on the text and use it for indexing. I wonder if the postgres indexing facilities would handle longer text columns well by itself, so there is not much to gain by hashing. I am talking about modest size tables, in web applications. /Str. I don't understand the question. Do you mean normal B-Tree index or text index? Postgres does have text indexing engine, it's called Tsearch2, and if you only need to search for words, it's an excellent choice. If you need more capabilities than Tsearch2 provides (phrases, near operator), you can try with Sphinx, Lucene or Xapian. All of those engines are geared toward large text documents and can handle multi-MB documents with ease. |
#6
| |||
| |||
|
|
Thanks, I did not know about hash indexes, I was only aware of the B-tree. I want to look up the whole text. I just noticed that much more data would be sent to the database when looking up the text, instead of a self made hash. It still would be interesting to know how well b-tree indexes perform on longer text columns. /Str. |
![]() |
| Thread Tools | |
| Display Modes | |
| |