dbTalk Databases Forums  

data with varying fields compositioin

comp.databases comp.databases


Discuss data with varying fields compositioin in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Stanley Yao
 
Posts: n/a

Default data with varying fields compositioin - 02-10-2004 , 11:42 AM






Hi,
If the nature of the data to be stored in a database is like the
following, what is a more efficient method of storing them?
For a table in the database, there are a large number of possible
fields (mostly of type integer, float and strings), but each row has a
different set of fields available and other fields are N/A. Field sets
of rows overlap some but still have much non-overlaping. For example,
all possible files are F1, F2, ..., F50. Row 1 only has F1, F6, F8, ...,
F47, F49. Row 2 has F1, F2, F5, F6, F10, ..., F46, F49, F50. And so on. If
the table is created with a schema including all possible files (F1
through F50), many fields in the database is just empty which is not very
efficient. What is a better way of storing this kind of data? Is
document-oriented database (XML) better? Are there better ones than XML?
Thanks,
Stan


Reply With Quote
  #2  
Old   
Stanley Yao
 
Posts: n/a

Default Re: data with varying fields compositioin - 02-10-2004 , 11:58 AM






On Tue, 10 Feb 2004, Stanley Yao wrote:

Quote:
Hi,
If the nature of the data to be stored in a database is like the
following, what is a more efficient method of storing them?
For a table in the database, there are a large number of possible
fields (mostly of type integer, float and strings), but each row has a
different set of fields available and other fields are N/A. Field sets
of rows overlap some but still have much non-overlaping. For example,
all possible files are F1, F2, ..., F50. Row 1 only has F1, F6, F8, ...,
F47, F49. Row 2 has F1, F2, F5, F6, F10, ..., F46, F49, F50. And so on. If
the table is created with a schema including all possible files (F1
through F50), many fields in the database is just empty which is not very
efficient. What is a better way of storing this kind of data? Is
document-oriented database (XML) better? Are there better ones than XML?
Thanks,
Stan

I just want to add that it is not obvious how to break all fields into
different tables. Sience it's science data, all fields are not clearly
dividable into groups; they are equally part of the whole row. It is
better to store all fields in the same table.
Thanks,
Stan



Reply With Quote
  #3  
Old   
--CELKO--
 
Posts: n/a

Default Re: data with varying fields compositioin - 02-11-2004 , 10:59 AM



Quote:
For a table in the database, there are a large number of possible
fields (mostly of type integer, float and strings), but each row has a
different set of fields available and other fields are N/A. <<

First of all, tables have rows and rows have columns. Tables are not
files; rows are not records and columns are fields. You have mixed a
file structure and an RDBMS together.

This data is not organized enough for an RDBMS yet. You might want to
keep it in an XML file or raw notes unitl you can organize it enough
for it to be put into a database.


Reply With Quote
  #4  
Old   
Nickgreenwood716
 
Posts: n/a

Default Re: data with varying fields compositioin - 02-14-2004 , 06:15 AM



Hello Stan,

These sparse databases can be handled in a number of ways.
One way is to break the table into a 'star' arrangement. A central table
contains an any key information that each given row in your original table
would need to identify it, plus a unique identifier. All of the other fields
are stored in appropriate groups in other tables. These tables have an extra
column to store the unique identifier from the central table. Then to query the
data you do a join between the central table and those other tables containing
the contents that you are interested in. The resulting view looks like your
sparse table, but takes up less storage.


Reply With Quote
  #5  
Old   
Neo
 
Posts: n/a

Default Re: data with varying fields compositioin - 02-14-2004 , 07:25 PM



Quote:
If the nature of the data to be stored in a database is like the
following, what is a more efficient method of storing them?
For a table in the database, there are a large number of possible
fields (mostly of type integer, float and strings), but each row has a
different set of fields available and other fields are N/A. Field sets
of rows overlap some but still have much non-overlaping. For example,
all possible files are F1, F2, ..., F50. Row 1 only has F1, F6, F8, ...,
F47, F49. Row 2 has F1, F2, F5, F6, F10, ..., F46, F49, F50. And so on. If
the table is created with a schema including all possible files (F1
through F50), many fields in the database is just empty which is not very
efficient. What is a better way of storing this kind of data? Is
document-oriented database (XML) better? Are there better ones than XML?
XDb is a small experimental database which allows each thing to have
different properties. XDb can present these things as tables and
trees. XDb can export trees as XML files and tables as tab-delimited
text files. See www.xdb1.com/Example/Ex102.asp for an example where
persons have different properties depending on their roles. If you
would like to provide more specifications and sample data I can create
a small db that you can play with.


Reply With Quote
  #6  
Old   
Ed prochak
 
Posts: n/a

Default Re: data with varying fields compositioin - 02-16-2004 , 01:00 PM



nickgreenwood716 (AT) aol (DOT) com (Nickgreenwood716) wrote in message news:<20040214071534.28419.00002152 (AT) mb-m15 (DOT) aol.com>...
Quote:
Hello Stan,

These sparse databases can be handled in a number of ways.
One way is to break the table into a 'star' arrangement. A central table
contains an any key information that each given row in your original table
would need to identify it, plus a unique identifier. All of the other fields
are stored in appropriate groups in other tables. These tables have an extra
column to store the unique identifier from the central table. Then to query the
data you do a join between the central table and those other tables containing
the contents that you are interested in. The resulting view looks like your
sparse table, but takes up less storage.
And taking the assumption that all the data IS part of ONE table, some
databases can deal with this quite nicely. ORACLE has some overhead
per field, but certainly doesn't allocate fixed length fields,
especially for character data. I'd assume SQL Server is similar. So
breaking a table to potentially save storage may be a bad choice. It
all depends on your DBMS.

But I'd suggest that you haven't really thought about you data enough.
Just because all the data about a customer is related to that customer
doesn't mean I put address, call records, or payments in the Customer
table. The fact that you have sparse data suggests to me that some of
the information comes from different sources than the rest which may
be one way to sort things out. I'm sure there are others. You need to
add a data modeller to your team. They would be able to help your
design. And a finaly hint. If you have at least some idea of the types
of queries you will perform on the data, that my help quide your
design also.

HTH,
ed


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.