Application Question (Database vs Filesystem)

comp.databases


#1
André
Application Question (Database vs Filesystem) - 03-11-2010, 10:54 AM

Suppose I want to build a web-based data repository to store, serve, and
provide interaction for all kinds of datasets, possibly holding up to
1 million different datasets (time series, bulk data, etc.) ranging from
1 kB to 1 MB in size. Would it be a good idea to store everything in a
database, or should only the metadata of those datasets go into a
database, with the data itself kept in one file per dataset?

Any help would be appreciated.

Many thanks

Andre

#2
Ed Prochak
Re: Application Question (Database vs Filesystem) - 03-12-2010, 12:36 AM

There was a time when I would definitely have said "put those large
objects in files". Nowadays, I do not think the answer is so clear-cut.
For example, Oracle handles large data objects better than it used to.
So the answer really is to test it. Build two prototype systems, one
using files and the other storing the data in the DB, and compare their
performance under conditions similar to your expected application load.
If there is no clear winner, go with whichever seems easier for you to
maintain in the long run.
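
As a rough illustration of such a side-by-side prototype, here is a minimal Python sketch. It is only a starting point under loud assumptions: SQLite (from the standard library) stands in for whatever DBMS you would really use, the datasets are random bytes, and the names `make_datasets`, `bench_files`, `bench_db`, and `run_benchmark` are made up for this example. A realistic test would use your actual server, real data, and concurrent clients.

```python
import os
import sqlite3
import tempfile
import time


def make_datasets(count=200, size=64 * 1024):
    """Generate hypothetical datasets: name -> random bytes."""
    return {f"set_{i:05d}.bin": os.urandom(size) for i in range(count)}


def bench_files(datasets, root):
    """Write every dataset to its own file, then read them all back."""
    start = time.perf_counter()
    for name, blob in datasets.items():
        with open(os.path.join(root, name), "wb") as fh:
            fh.write(blob)
    written = time.perf_counter()
    total = 0
    for name in datasets:
        with open(os.path.join(root, name), "rb") as fh:
            total += len(fh.read())
    done = time.perf_counter()
    return written - start, done - written, total


def bench_db(datasets, db_path):
    """Store every dataset as a BLOB row, then read them all back."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE datasets (filename TEXT PRIMARY KEY, content BLOB)"
    )
    start = time.perf_counter()
    with conn:  # one transaction for all inserts
        conn.executemany("INSERT INTO datasets VALUES (?, ?)", datasets.items())
    written = time.perf_counter()
    total = 0
    for name in datasets:
        row = conn.execute(
            "SELECT content FROM datasets WHERE filename = ?", (name,)
        ).fetchone()
        total += len(row[0])
    done = time.perf_counter()
    conn.close()
    return written - start, done - written, total


def run_benchmark(count=200, size=64 * 1024):
    """Return {backend: (write_seconds, read_seconds, bytes_read)}."""
    datasets = make_datasets(count, size)
    with tempfile.TemporaryDirectory() as tmp:
        files_dir = os.path.join(tmp, "files")
        os.mkdir(files_dir)
        return {
            "files": bench_files(datasets, files_dir),
            "db": bench_db(datasets, os.path.join(tmp, "bench.sqlite")),
        }


if __name__ == "__main__":
    for backend, (w, r, total) in run_benchmark().items():
        print(f"{backend:5s} write {w:.3f}s  read {r:.3f}s  bytes {total}")
```

The timings from a toy run like this say little by themselves; the point is to have both variants behind the same harness so that you can swap in realistic sizes and access patterns.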

One other point to consider if it is a toss-up: how will backups be
performed? With DB only, there is just one data backup process. With
DB and files, you need two that MUST stay in sync.
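
To make the sync problem concrete, here is a small Python sketch of the kind of consistency check a DB-plus-files setup tends to need after a restore. The table name `dataset_meta`, the one-flat-directory layout, and the function names are assumptions for illustration only, and SQLite again stands in for the real database.

```python
import os
import sqlite3
import tempfile


def find_mismatches(conn, data_dir):
    """Compare filenames recorded in the database with files on disk.

    Returns (missing_files, orphan_files):
      missing_files: rows whose file is absent from data_dir
      orphan_files:  files on disk that no row refers to
    """
    in_db = {row[0] for row in conn.execute("SELECT filename FROM dataset_meta")}
    on_disk = set(os.listdir(data_dir))
    return sorted(in_db - on_disk), sorted(on_disk - in_db)


def demo():
    """Simulate a metadata backup and a file backup taken at different times."""
    with tempfile.TemporaryDirectory() as data_dir:
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE dataset_meta (filename TEXT PRIMARY KEY)")
        # The database knows about a.bin and b.bin ...
        conn.executemany(
            "INSERT INTO dataset_meta VALUES (?)", [("a.bin",), ("b.bin",)]
        )
        # ... but the directory holds b.bin and c.bin.
        for name in ("b.bin", "c.bin"):
            with open(os.path.join(data_dir, name), "wb") as fh:
                fh.write(b"payload")
        result = find_mismatches(conn, data_dir)
        conn.close()
        return result
```

With everything in the database, this class of check is unnecessary, because content and metadata are restored from the same backup.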

HTH,
Ed

#3
Jasen Betts
Re: Application Question (Database vs Filesystem) - 03-12-2010, 03:35 AM

1 MB is fairly small. I would put it in the database, provided the
database does not impose silly limits on size. That saves you from race
conditions and data-file synchronisation problems.

E.g., with PostgreSQL:

CREATE TABLE datasets (
    filename  text PRIMARY KEY,
    content   bytea,
    mtime     timestamptz,
    mime_type text  -- "mime-type" is not a valid unquoted identifier
);
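
For a feel of how an application might use such a table, here is a hedged Python sketch. To keep it runnable without a server it uses the standard-library sqlite3 module instead of PostgreSQL, so `bytea` becomes `BLOB` and `timestamptz` becomes ISO 8601 text; the helper names `open_store`, `put_dataset`, and `get_dataset` are invented for this example.

```python
import sqlite3
from datetime import datetime, timezone

# SQLite version of the table above (an assumption made for runnability).
SCHEMA = """
CREATE TABLE datasets (
    filename  TEXT PRIMARY KEY,
    content   BLOB,      -- bytea in PostgreSQL
    mtime     TEXT,      -- timestamptz in PostgreSQL; ISO 8601 text here
    mime_type TEXT
)
"""


def open_store(path=":memory:"):
    """Open a database and create the datasets table."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def put_dataset(conn, filename, content, mime_type):
    """Insert or replace a dataset; content and metadata commit together."""
    mtime = datetime.now(timezone.utc).isoformat()
    with conn:  # single transaction: no window where data and metadata differ
        conn.execute(
            "INSERT OR REPLACE INTO datasets (filename, content, mtime, mime_type) "
            "VALUES (?, ?, ?, ?)",
            (filename, content, mtime, mime_type),
        )


def get_dataset(conn, filename):
    """Return (content, mime_type), or None if no such dataset exists."""
    row = conn.execute(
        "SELECT content, mime_type FROM datasets WHERE filename = ?",
        (filename,),
    ).fetchone()
    return None if row is None else (bytes(row[0]), row[1])
```

Because the content and its metadata are written in one transaction, a reader never sees a new `mtime` paired with old content, which is the race a separate file store has to guard against by hand.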



#4
Robert Klemme
Re: Application Question (Database vs Filesystem) - 03-14-2010, 04:34 AM

I agree with Ed and Jasen. From a maintenance and consistency point of
view, it is much easier to store everything in the database. The backup
topic especially can become complex and fragile quickly if you have
different storage systems that you need to keep synchronized. With
Oracle (or any other decent RDBMS) you just back up the database and
are done.

However, I would also consider storing the datasets in the database in
their structured form and going from there. That way you could fetch
only parts of a dataset, reducing the overhead and I/O bandwidth needed.
If, on the other hand, a dataset is something like a JPEG image, then
storing it as a blob is more appropriate (although even then you might
want to store some additional metadata in the table). It really depends
on the use case, although, rereading your message, it seems your
situation is more like the latter.
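
To illustrate the structured option, here is a minimal Python sketch of a time series stored one sample per row, so that a client can fetch just a time window instead of the whole dataset. The `samples` table, its columns, and the helper names are assumptions invented for this example, and SQLite stands in for the real RDBMS.

```python
import sqlite3


def open_timeseries_store(path=":memory:"):
    """Create a hypothetical one-row-per-sample table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE samples (
               dataset_id INTEGER NOT NULL,
               ts         INTEGER NOT NULL,  -- e.g. Unix seconds
               value      REAL    NOT NULL,
               PRIMARY KEY (dataset_id, ts)
           )"""
    )
    return conn


def add_samples(conn, dataset_id, samples):
    """Insert an iterable of (ts, value) pairs for one dataset."""
    with conn:
        conn.executemany(
            "INSERT INTO samples (dataset_id, ts, value) VALUES (?, ?, ?)",
            [(dataset_id, ts, value) for ts, value in samples],
        )


def fetch_range(conn, dataset_id, start_ts, end_ts):
    """Return [(ts, value), ...] with start_ts <= ts < end_ts, ordered by ts."""
    return conn.execute(
        "SELECT ts, value FROM samples "
        "WHERE dataset_id = ? AND ts >= ? AND ts < ? ORDER BY ts",
        (dataset_id, start_ts, end_ts),
    ).fetchall()
```

The composite primary key on `(dataset_id, ts)` lets the range query use an index, so only the requested slice is read. Opaque payloads such as images gain nothing from this layout and are better kept as a single blob.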

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
