#5
Dear, well, database experts?

I am seeking to design a single-user database backend that is well-suited to a somewhat impure task: I would like to store so-called translation memories in it and query them for individual words or, if reasonably feasible, even patterns such as regular expressions.

Translation memories are effectively long lists of sentence pairs, one sentence in one language and the other in another (you may also have more than two languages, but that's not the point here). Each pair usually also has an ID. In a table, a record would thus look something like this (it should look right when viewed in a monospaced font):

ID  English            German
1   The quick brown    Der flinke braune
    fox jumped over    Fuchs sprang über
    the lazy dog.      den trägen Hund.

I am aware that I can retrieve such records using a LIKE statement, but I understand that what I am about to do is not exactly what database theory is about, since the data are not atomic. Put another way, I understand that if you always query a table using LIKE statements, it is probably poorly designed (or so I seem to have gathered). Yet I don't see how I could make these data atomic.

The whole point of this is that searching for such matches in a lot of files is unacceptably slow, and I hope that, even if this might not be a classic database task, a database would be much faster at it.

I would thus have two questions:

1. Is there a standard design solution to this problem? An "index", perhaps?
2. Which SQL DBMS do you think is best suited to this task?

To make the choices more limited, I would also have the following requirements:

a) The DBMS should be open source or, at least, free.
b) It should run on Linux and Windows (in a native version), and it should be possible to use the same database under both OSes (e.g. boot Linux and store records, then boot Windows and query them).
c) It should be space- rather than time-efficient, i.e. it should use as little disk space as possible (as to the encoding, ISO-8859-1 would be sufficient). (I know the most space-efficient solution would be not to use a DBMS but to search text files, but that takes SO long that I hope a good compromise is possible with a DBMS.)
d) Ideally, it would also offer something like "lazy loading" of tables, i.e. only load them into memory when they are queried (but that is perhaps not that crucial).

I made an attempt with MySQL and apparently ran into problems with requirement b). I figure the reason might be that I haven't been able to get exactly matching versions for the two OSes.

That's probably enough for one question. I would be very glad if you let me know your opinion. As might be apparent from what I've written, I'm new to databases but have read a book on SQL, so I hope to have some measure of grasp.

Thanks very much in advance!

Florian
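One possible reading of the "how could I make these data atomic" question above is to split every sentence into words and keep those words in a table of their own, so that a word lookup becomes an exact match on an indexed column instead of a LIKE scan over the sentences. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 module because it is easy to run, not because the thread recommends SQLite, and the table names (segment, word) and helper functions are hypothetical.

```python
import re
import sqlite3


def build_db(pairs):
    """Create an in-memory database with a sentence table and a word table.

    `pairs` is an iterable of (id, english, german) tuples.
    The schema is a hypothetical example, not taken from the thread.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE segment (
            id      INTEGER PRIMARY KEY,
            english TEXT NOT NULL,
            german  TEXT NOT NULL
        );
        -- One row per (word, sentence pair): this is the "atomic" part.
        CREATE TABLE word (
            token      TEXT    NOT NULL,
            segment_id INTEGER NOT NULL REFERENCES segment(id)
        );
        -- An ordinary index on the word column makes exact lookups cheap.
        CREATE INDEX word_token ON word (token);
    """)
    for seg_id, english, german in pairs:
        con.execute("INSERT INTO segment VALUES (?, ?, ?)",
                    (seg_id, english, german))
        # Lower-case and de-duplicate the words of both sentences.
        tokens = set(re.findall(r"\w+", (english + " " + german).lower()))
        con.executemany("INSERT INTO word VALUES (?, ?)",
                        [(token, seg_id) for token in sorted(tokens)])
    con.commit()
    return con


def find_word(con, word):
    """Return all sentence pairs containing `word` in either language."""
    rows = con.execute(
        "SELECT s.id, s.english, s.german "
        "FROM word AS w JOIN segment AS s ON s.id = w.segment_id "
        "WHERE w.token = ? ORDER BY s.id",
        (word.lower(),),
    )
    return rows.fetchall()


if __name__ == "__main__":
    con = build_db([
        (1, "The quick brown fox jumped over the lazy dog.",
            "Der flinke braune Fuchs sprang über den trägen Hund."),
        (2, "The dog sleeps.", "Der Hund schläft."),
    ])
    for row in find_word(con, "Hund"):
        print(row)
```

The trade-off is the one raised in requirement c): the word table costs extra disk space in exchange for faster lookups, so how acceptable it is depends on the size of the translation memories.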
#6
Specialized indices, yes. Look into the Oracle® Text Application Developer's Guide.

"It should be possible to use the same database under both OSes (e.g. boot Linux and store records, then boot Windows and query them)."

The second part of your requirement is yuck. I know of no DBMS that can work that way, except possibly some flat-file databases.

You are going to have to compromise.

I'd still be very surprised to hear this [i.e. the above] works even if you got exactly the same versions of MySQL in both OSes. Is this something MySQL claims to do?

I hope you don't think reading a[n] SQL book tells you anything about databases.

You have a lot of learning to do.
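To give a rough idea of what a "specialized index" for text can look like, here is a minimal sketch of an inverted index in plain Python. It does not show how Oracle Text or any other product works internally; the class name WordIndex and its methods are hypothetical. It maps each word to the IDs of the sentence pairs containing it and, as a simplification, handles regular expressions by matching them against the list of distinct words rather than against whole sentences.

```python
import re
from collections import defaultdict


class WordIndex:
    """A toy inverted index: word -> set of sentence-pair IDs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, pair_id, sentence):
        """Index every word of `sentence` under `pair_id`."""
        for token in re.findall(r"\w+", sentence.lower()):
            self.postings[token].add(pair_id)

    def lookup(self, word):
        """Return the sorted IDs of all pairs containing exactly `word`."""
        return sorted(self.postings.get(word.lower(), ()))

    def match(self, pattern):
        """Return the sorted IDs of all pairs with a word fully matching `pattern`.

        Only the distinct words are scanned, which is usually a much
        shorter list than the sentences themselves. Patterns spanning
        several words are NOT supported by this sketch.
        """
        regex = re.compile(pattern)
        ids = set()
        for token, pair_ids in self.postings.items():
            if regex.fullmatch(token):
                ids |= pair_ids
        return sorted(ids)


if __name__ == "__main__":
    index = WordIndex()
    index.add(1, "The quick brown fox jumped over the lazy dog.")
    index.add(2, "The dog sleeps.")
    print(index.lookup("dog"))
    print(index.match(r"jump\w*"))
```

Whether a given DBMS offers something comparable out of the box, and under which name, would need to be checked in its own documentation.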