Large Database System

Discuss Large Database System in the comp.databases forum.



  #1  
raidvvan@yahoo.com

Large Database System - 10-19-2007, 10:01 AM






Hi there,

We have been looking for some time now for a database system that can
fit a large distributed computing project, but we haven't been able to
find one.
I was hoping that someone could point us in the right direction or give
us some advice.

Here is what we need. Mind you, these are ideal requirements, so we do
not expect to find something that meets all of them,
but we hope to get reasonably close.

We need a database/file system:
1. built in C, preferably ANSI C, so that we can port it to
Linux/Unix, Windows, Mac and various other platforms;
if it works on Linux only, that is OK for now
2. that has a public domain or GPL/LGPL licence and source code access
3. uses hashing or b-trees or a similar structure
4. has support for files in the range of 1-10 GB; if it can get to 1
GB only, that should still be OK
5. can work with an unlimited number of files on a local machine; we
don't need access over a network, just local file access
6. that is fairly simple (i.e. library-style, key/data records); it
doesn't have to have SQL support of any kind; as long as we can add,
update, possibly delete data,
browse through the records and filter/query them it should be OK; no
other features are required, like backup, restore, users & security,
stored procedures...
7. reliable if possible
8. local transactional support if possible; there is no need for
distributed transactions
9. fast data access if possible
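
The library-style key/data interface described in item 6 (add, update, delete, browse, filter) can be sketched as follows. This is only an illustration of the interface shape, written in Python for brevity; it uses the standard library's `dbm.dumb` module as a stand-in, which is neither fast nor transactional and is not being proposed as a fit for the other requirements. The key names and the `demo` function are hypothetical.

```python
import dbm.dumb
import os
import tempfile


def demo(path):
    """Exercise add / update / delete / browse / filter on a key/data file."""
    # "c" opens the database read/write and creates it if it does not exist.
    with dbm.dumb.open(path, "c") as db:
        db[b"user:1"] = b"alice"     # add
        db[b"user:2"] = b"bob"       # add
        db[b"job:7"] = b"pending"    # add
        db[b"user:2"] = b"robert"    # update a single record, no full rebuild
        del db[b"job:7"]             # delete

        # Browse every record, then filter by a key prefix.
        everything = sorted((k, db[k]) for k in db.keys())
        users = [(k, v) for k, v in everything if k.startswith(b"user:")]
    return everything, users


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        everything, users = demo(os.path.join(tmp, "records"))
        print(everything)
        print(users)
```

Unlike a constant database, each update here touches one record rather than regenerating the whole file, which is the property the requirements list is asking for.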

We cannot use any of the major commercial databases (e.g. Oracle, SQL
Server, DB2 or larger systems like Daytona...), obviously because of
licensing
and source code issues. We looked closely at MySQL and PostgreSQL, but they
are too big and have far too many features that we do not need.
We need to be able to install a database/file system on possibly tens
of thousands of machines and we also expect it to work without
administration.
On top of that, we might end up with thousands of files of different
sizes on each machine. Are there any embedded (i.e. "lighter")
versions of these two databases?
We haven't been able to find anything like that. I am not sure how
much work would be involved in "trimming" down one of these databases,
but it doesn't seem easy to do.
Berkeley DB would have been the best, but it is now in Oracle's hands and
the licence has changed. TinyCDB was a close call, but the fact
that we need to rebuild the database for each data update makes it
unfeasible for large files (i.e. ~1 GB). SQLite is very interesting,
but it has many features that we don't need, like SQL support.

Right now we are using plain XML files so anything else would be a
great improvement.

Any suggestions or links to sites or papers or books would be welcome.
Any help would be greatly appreciated.

If this is not the proper forum, I would appreciate it if someone could
move the post to the right location or point us to the right one.

Thanks in advance.

Best regards,
Ovidiu Anghelidi
ovidiu (AT) intelligencerealm (DOT) com

Artificial Intelligence - Reverse Engineering The Brain


  #2  
Roy Hann

Re: Large Database System - 10-19-2007, 10:11 AM






<raidvvan (AT) yahoo (DOT) com> wrote

Quote:
Hi there,

We have been looking for some time now for a database system that can
fit a large distributed computing project, but we haven't been able to
find one.
I was hoping that someone can point us in the right direction or give
us some advice.
[requirements snipped]

I would have suggested Ingres (www.ingres.com), but you do seem to want a
curious hybrid of a DBMS (transaction management) and a file access method,
so maybe Ingres is overkill in the same way that PostgreSQL is.

Roy




  #3  
michal.zaborowski@gmail.com

Re: Large Database System - 10-19-2007, 02:34 PM



Give SQLite a try. Note that having too many features is much
better than lacking even one.
It uses paged files and is simple and fast. If you do not need concurrency
it will be OK. It works well with bigger files.
There is something you have to know: data is stored as strings. If
you have a lot of data, that can be a
problem.
PostgreSQL is good if you have a central server, but its footprint is quite
big. Another option is Firebird.
Yes, it is an SQL server, but it sits somewhere between SQLite and PG.
There is an embedded version,
bigger than the SQLite library: about 1.5 MB, AFAIR.

--
Regards,
Michał Zaborowski (TeXXaS)
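
As a rough illustration of the suggestion above, SQLite can be used as a plain key/data store with local transactions while ignoring the rest of SQL. The sketch below uses Python's standard `sqlite3` module for brevity; the table name `kv`, its columns, and the helper functions are assumptions made for this example, not anything prescribed by SQLite or by the posters.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite file used purely as a key/data store (hypothetical schema)."""
    conn = sqlite3.connect(path)
    # One table with a primary key; SQLite indexes the key with a B-tree.
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)")
    return conn


def put(conn, key, value):
    # INSERT OR REPLACE covers both "add" and "update".
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))


def get(conn, key):
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None


def demo():
    conn = open_store()
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        put(conn, "a", b"1")
        put(conn, "b", b"2")
    try:
        with conn:
            put(conn, "a", b"changed")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass  # the update to "a" was rolled back
    result = (get(conn, "a"), get(conn, "b"), get(conn, "missing"))
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())  # (b'1', b'2', None)
```

The same pattern (one table, prepared statements, explicit transactions) is available from SQLite's C API, which is what a C project would call directly.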


  #4  
CRPence

Re: Large Database System - 10-31-2007, 12:27 AM



I have no first-hand knowledge, but I have come across two links
recently that I figure might be of interest for meeting some of the objectives.

The Apache Derby project, maybe?
1> http://db.apache.org/
Apparently its licensing is /compatible with/ the GPL, and it is Java. I am
not clear on its relationship to Cloudscape:
http://www.ibm.com/software/data/cloudscape/ Apparently "Cloudscape is
a commercial release of the Apache Software Foundation's (ASF) open
source Apache Derby relational database and is available at no charge."
per
http://www.ibm.com/developerworks/db...ine/index.html

Although it does not come with source and is not available for Mac, the
following link to DB2 Express-C suggests it is "Free to develop, deploy, distribute":
2> http://www.ibm.com/software/data/db2/express/
This product is apparently intended for those "considering or using
open source or other no-charge database servers" per:
http://www-128.ibm.com/developerwork...+Express-C+FAQ
Given current storage as XML, I infer the following might be of interest
as well; a tutorial that "explains how to handle XML documents natively
in the no-cost, open community DB2 Express-C and DB2 Developer Workbench":
http://www.ibm.com/developerworks/ed...&S_CMP=DEVXDTA

Regards, Chuck
--
All comments provided "as is" with no warranties of any kind
whatsoever and may not represent positions, strategies, nor views of my
employer

