![]() | |
#1
| |||
| |||
|
#2
| |||
| |||
|
|
Hi, If the nature of the data to be stored in a database is like the following, what is a more efficient method of storing them? For a table in the database, there are a large number of possible fields (mostly of type integer, float and strings), but each row has a different set of fields available and other fields are N/A. Field sets of rows overlap some but still have much non-overlaping. For example, all possible files are F1, F2, ..., F50. Row 1 only has F1, F6, F8, ..., F47, F49. Row 2 has F1, F2, F5, F6, F10, ..., F46, F49, F50. And so on. If the table is created with a schema including all possible files (F1 through F50), many fields in the database is just empty which is not very efficient. What is a better way of storing this kind of data? Is document-oriented database (XML) better? Are there better ones than XML? Thanks, Stan I just want to add that it is not obvious how to break all fields into |
#3
| |||
| |||
|
|
For a table in the database, there are a large number of possible fields (mostly of type integer, float and strings), but each row has a |
#4
| |||
| |||
|
#5
| |||
| |||
|
|
If the nature of the data to be stored in a database is like the following, what is a more efficient method of storing them? For a table in the database, there are a large number of possible fields (mostly of type integer, float and strings), but each row has a different set of fields available and other fields are N/A. Field sets of rows overlap some but still have much non-overlaping. For example, all possible files are F1, F2, ..., F50. Row 1 only has F1, F6, F8, ..., F47, F49. Row 2 has F1, F2, F5, F6, F10, ..., F46, F49, F50. And so on. If the table is created with a schema including all possible files (F1 through F50), many fields in the database is just empty which is not very efficient. What is a better way of storing this kind of data? Is document-oriented database (XML) better? Are there better ones than XML? |
#6
| |||
| |||
|
|
Hello Stan, These sparse databases can be handled in a number of ways. One way is to break the table into a 'star' arrangement. A central table contains an any key information that each given row in your original table would need to identify it, plus a unique identifier. All of the other fields are stored in appropriate groups in other tables. These tables have an extra column to store the unique identifier from the central table. Then to query the data you do a join between the central table and those other tables containing the contents that you are interested in. The resulting view looks like your sparse table, but takes up less storage. |
![]() |
| Thread Tools | |
| Display Modes | |
| |