dbTalk Databases Forums  

Application Question (PostgreSQL vs Filesystem)

comp.databases.postgresql


  #1  
André
 

Application Question (PostgreSQL vs Filesystem) - 03-11-2010, 10:55 AM






Suppose I want to build a web-based data repository to store, serve and
provide interaction for all kinds of datasets, possibly holding up to
1 million different datasets (time series, bulk data, etc.) ranging from
1 kB to 1 MB in size. Would it be a good idea to do this using a
PostgreSQL database, or should only the metadata of those datasets be
stored in PostgreSQL and the data itself simply kept in one file per
dataset?

Any help would be appreciated.

Many thanks

Andre

  #2  
Mladen Gogala
 

Re: Application Question (PostgreSQL vs Filesystem) - 03-11-2010, 02:28 PM






On Thu, 11 Mar 2010 16:55:17 +0100, André wrote:

> Assuming I want to build a web based data repository to store, serve and
> provide interaction for all different kinds of datasets, possibly
> holding up to 1 million different datasets (timeseries, bulk data, etc.)
> ranging from 1kB to 1MB in size. Would it be a good idea to do this
> using a Postgresql database or should only the meta data of those
> datasets be stored using Postgresql and the data itself simply by using
> files for each dataset?
>
> Any help would be appreciated.
>
> Many thanks
>
> Andre

Andre, filesystems, in contrast to databases, were not designed to deal
with a gazillion small files. You will run into various issues with
directory handling, search speed and the like. If that was already
visible when sharing a few thousand MP3 files over Samba with my laptop
(on my home LAN), it will be an even bigger problem with potentially
hundreds of thousands of files. So my answer is to put it all into the
database, especially if you have metadata such as owner, creation date,
short description, expiration date and the like.
There is an additional problem with file systems: there is no
transaction log for your data. If the disk is gone, you can restore from
the latest backup, but you will never know whether you restored
everything. At that point, your database is probably out of sync with
the file system and you will have to write a syncing job, which is not
an easy thing to do with the CIO staring at you while you're typing. If
you put everything into the database, you have the WAL (write-ahead
log), and even if you lose something, your DB will still be consistent.
Metadata is kept in the same record as the document, so you will lose
either the entire row or nothing. In either case, your DB is consistent.
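
A minimal sketch of the "metadata in the same record as the document"
idea, in case it helps. To keep it runnable without a server it uses
Python's built-in sqlite3 module as a stand-in for PostgreSQL; with
PostgreSQL the payload column would typically be bytea and you would go
through a driver. The table name, column names and the store_dataset
helper are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a PostgreSQL connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dataset (
        id          INTEGER PRIMARY KEY,
        owner       TEXT NOT NULL,
        description TEXT,
        created     TEXT NOT NULL,
        payload     BLOB NOT NULL  -- the dataset itself, next to its metadata
    )
    """
)


def store_dataset(conn, owner, description, created, payload):
    """Insert metadata and data as one row inside one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO dataset (owner, description, created, payload) "
            "VALUES (?, ?, ?, ?)",
            (owner, description, created, payload),
        )
        return cur.lastrowid


ok_id = store_dataset(conn, "andre", "hourly temperatures",
                      "2010-03-11", b"\x00\x01\x02")

# A failed insert (payload missing) leaves no half-written record behind.
try:
    store_dataset(conn, "andre", "broken upload", "2010-03-11", None)
except sqlite3.IntegrityError:
    pass

rows = conn.execute("SELECT id, owner, payload FROM dataset").fetchall()
```

After this runs, rows holds only the complete first record: the row is
either there with its metadata and data together, or not there at all.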
Last but not least, a filesystem will waste a ton of storage. If your
file is smaller than 4k, the rest of the 4k block will be wasted. Linux
file systems are not really all that good. Fragmentation can be an
issue, fsck takes forever and they do not have any kind of I/O
optimizing mechanism like the ones in z/OS or Files-11, which are
capable of storing information that is used together on neighboring
tracks, based on usage. If you opt for the filesystem, my advice would
be to go with a commercial one, like VxFS.
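
To put a rough number on the wasted space, here is a small Python sketch
of the arithmetic. It assumes a 4096-byte block size and that every file
occupies whole blocks only (no tail packing or inline storage); the
function names are made up for illustration.

```python
BLOCK_SIZE = 4096  # assumed file system block size in bytes (the "4k" above)


def allocated_bytes(file_size, block_size=BLOCK_SIZE):
    """Bytes taken on disk if a file can only occupy whole blocks."""
    if file_size <= 0:
        return 0
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def slack_bytes(file_size, block_size=BLOCK_SIZE):
    """Bytes allocated to the file but not used by its contents."""
    return allocated_bytes(file_size, block_size) - file_size


# Slack for a 1 kB dataset, and the total for one million of them.
per_file = slack_bytes(1024)
total = per_file * 1_000_000
```

Under these assumptions a 1 kB file leaves 3072 bytes of its block
unused, so one million such files would leave roughly 3 GB unused. For
files near 1 MB the relative waste is much smaller.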


--
http://mgogala.byethost5.com
