#11
On 18.10.2006 14:59, jpd wrote:

But you save yourself the hassle of dealing with one additional component, namely the file system.

> If you want to keep that in sync, you're likely looking at some content management system, and for consistency you'd probably have to make that self-referential as well. So adding a couple of checks for url columns referencing datafiles somewhere in the system then isn't that much of a problem.
>
> I'll admit that putting literally everything in a database has its elegance, but the model itself only holds up to a certain point and might not be all that practical beyond that. You did mention backups, and a backup solution also needs to back it all up together.

What point exactly do you mean?

From my experience, file systems don't do very well with large numbers of files in a single directory. If you start distributing files across several directories, you'll be better off just using a database's built-in indexing, which is built for exactly that.
#12
Begin <4pmq3gFjghafU1 (AT) individual (DOT) net
On 2006-10-18, Robert Klemme <shortcutter (AT) googlemail (DOT) com> wrote:
> On 18.10.2006 14:59, jpd wrote:
>
> But you save yourself the hassle of dealing with one additional component, namely the file system.

Except that in this scenario you need it anyway, and it is in fact just about the simplest access method of the range available, and optimised for the task to boot. This isn't always true, of course, but that wasn't the point. The point was that there are situations where it is true, and that I prefer to organise systems in such a way as to be able to exploit the simplest mechanism available.

>> problem. I'll admit that putting literally everything in a database has its elegance, but the model itself only holds up to a certain point and might not be all that practical beyond that. You did mention backups, and a backup solution also needs to back it all up together.
>
> What point exactly do you mean?

I don't know where the point lies, exactly. I daresay it isn't really important to locate it very precisely, either. :-)

[snip]

> From my experience, file systems don't do very well with large numbers of files in a single directory. If you start distributing files across several directories, you'll be better off just using a database's built-in indexing, which is built for exactly that.

There are several file systems that do deal well with many files, e.g. FreeBSD's options DIRHASH.

Also, one could easily store files in the filesystem but use the database to locate the files (report a URL as stored in a table, treating it as being opaque), so for normal retrievals no traversing of directories is necessary and multiple directory levels can be used without added complexity to the fastpath.
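
A minimal sketch of that arrangement, assuming SQLite as the database and a local directory as the file store. The `files` table, the helper names (`open_catalog`, `store_file`, `load_file`) and the id-based directory layout are all hypothetical, chosen only for illustration. The point it tries to show is that a retrieval is one indexed lookup plus one file read, with the stored location never interpreted by the caller.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_catalog(path=":memory:"):
    """Open the (hypothetical) catalog database and create its table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, location TEXT NOT NULL)"
    )
    return db


def store_file(db, root, name, data):
    """Write data under root and record its location in the database."""
    cur = db.execute("INSERT INTO files (name, location) VALUES (?, '')", (name,))
    file_id = cur.lastrowid
    # Assumed layout: one directory level derived from the row id.
    rel = Path(f"{file_id % 256:02x}") / f"{file_id}.bin"
    target = root / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    db.execute(
        "UPDATE files SET location = ? WHERE id = ?", (rel.as_posix(), file_id)
    )
    db.commit()
    return file_id


def load_file(db, root, file_id):
    """Look up the opaque location by primary key, then read the file directly."""
    row = db.execute(
        "SELECT location FROM files WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        raise KeyError(file_id)
    return (root / row[0]).read_bytes()


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    db = open_catalog()
    fid = store_file(db, root, "cat.jpg", b"not really a JPEG")
    print(load_file(db, root, fid))
```

Because callers only ever pass the id, the directory layout can change later without touching the retrieval code.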
#13
On 18.10.2006 19:33, jpd wrote:
> Begin <4pmq3gFjghafU1 (AT) individual (DOT) net
> On 2006-10-18, Robert Klemme <shortcutter (AT) googlemail (DOT) com> wrote:
>> On 18.10.2006 14:59, jpd wrote:

Although I tend to agree, my experience tells me that the simplest approach often does break at some point. Either it is during the first project, because once implementation starts some forgotten aspects (requirements, limitations...) show up, or it is later, because throughout the lifetime additional requirements must be fulfilled. All this tells me to apply some caution when choosing approaches and not jump on the first or simplest solution that comes to mind. :-)

>>> I'll admit that putting literally everything in a database has its elegance, but the model itself only holds up to a certain point and might not be all that practical beyond that. You did mention backups, and a backup solution also needs to back it all up together.
>>
>> What point exactly do you mean?
>
> I don't know where the point lies, exactly. I daresay it isn't really important to locate it very precisely, either. :-)

Um, I was not so much after a concrete figure but rather what scale you were talking about. Is it number of items / files? Is it volume? Is it application complexity? Is it something else?

> There are several file systems that do deal well with many files, e.g. FreeBSD's options DIRHASH.

Yeah, I have heard that the development of file systems is going in that direction. I'm not sure how ubiquitous those are yet.

> Also, one could easily store files in the filesystem but use the database to locate the files (report a URL as stored in a table, treating it as being opaque), so for normal retrievals no traversing of directories is necessary and multiple directory levels can be used without added complexity to the fastpath.

Yes, but you nevertheless need the logic to distribute files across directories; if you think a bit about this you will likely start implementing some tree mechanisms, including attempting to fill directories equally, etc. That's exactly what DBs can do very well, as I tried to convey in my last posting (probably not clearly enough).
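
For reference, one common way to spread files over directories without maintaining a balanced tree is to derive the directory from a hash of the key. This is a swapped-in technique for illustration, not something either poster proposed; the function name `sharded_path` and its `levels` and `width` parameters are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(key: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a key to a relative path such as 'ab/cd/abcd...'.

    Each directory level uses `width` hex digits of the SHA-256 digest,
    so with width=2 every level has at most 256 subdirectories.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, digest)


if __name__ == "__main__":
    print(sharded_path("holiday-2006/img_0001.jpg"))
```

A cryptographic hash spreads keys roughly uniformly, so directories fill about evenly on average. What this does not give you is range queries, ordering, or rebalancing, which is where the database's index still has the advantage.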
#14
Begin <4povhaFjjd2nU1 (AT) individual (DOT) net
On 2006-10-19, Robert Klemme <shortcutter (AT) googlemail (DOT) com> wrote:
> On 18.10.2006 19:33, jpd wrote:
>> Begin <4pmq3gFjghafU1 (AT) individual (DOT) net
>> On 2006-10-18, Robert Klemme <shortcutter (AT) googlemail (DOT) com> wrote:
>>> On 18.10.2006 14:59, jpd wrote:
>
> Although I tend to agree, my experience tells me that the simplest approach often does break at some point. Either it is during the first project, because once implementation starts some forgotten aspects (requirements, limitations...) show up, or it is later, because throughout the lifetime additional requirements must be fulfilled. All this tells me to apply some caution when choosing approaches and not jump on the first or simplest solution that comes to mind. :-)

True enough. The more complex the approach, the more opportunity for breakage. Sometimes one does need complexity to deal with an inherently complex situation, but that is always a tradeoff against the extra risk it comes with.

>>>> I'll admit that putting literally everything in a database has its elegance, but the model itself only holds up to a certain point and might not be all that practical beyond that. You did mention backups, and a backup solution also needs to back it all up together.
>>>
>>> What point exactly do you mean?
>>
>> I don't know where the point lies, exactly. I daresay it isn't really important to locate it very precisely, either. :-)
>
> Um, I was not so much after a concrete figure but rather what scale you were talking about. Is it number of items / files? Is it volume? Is it application complexity? Is it something else?

Databases are made to manage tuples of small bits of data really well, so there is no sense in storing entire movies, as the database isn't able to do anything useful with that data anyway. You'd be much better off extracting the metadata, then storing that along with a pointer to the bulk of the data at a location elsewhere[1].

So, in short, databases are tools that, like all good tools, are really useful for what they do, but that doesn't mean they're useful for everything else, too.
#15
So basically your scale is application complexity, if I read you correctly.

- ensure data consistency via transactions
- optimize storage and accessibility through its internal caching and distribution mechanisms
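
To make the transaction point concrete, here is a small sketch assuming SQLite via Python's `sqlite3` module rather than whatever DBMS is actually in use. The `blobs` and `images` tables and the `fail` switch are hypothetical and exist only to simulate a crash between two related writes.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two related (hypothetical) tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE blobs (id INTEGER PRIMARY KEY, data BLOB NOT NULL)")
    db.execute(
        "CREATE TABLE images ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, blob_id INTEGER NOT NULL)"
    )
    return db


def add_image(db, name, data, fail=False):
    """Insert the blob and its metadata row as one unit of work."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with db:
            cur = db.execute("INSERT INTO blobs (data) VALUES (?)", (data,))
            if fail:
                raise RuntimeError("simulated failure between the two writes")
            db.execute(
                "INSERT INTO images (name, blob_id) VALUES (?, ?)",
                (name, cur.lastrowid),
            )
        return True
    except RuntimeError:
        return False


def count(db, table):
    """Row count for one of the two known tables."""
    assert table in ("blobs", "images")
    return db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

After a failed `add_image(..., fail=True)` neither table gains a row, which is the guarantee that has to be rebuilt by hand once the bulk data lives outside the database.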
#16
[snippage]
#17
We currently store images as binary blobs in SQL Server. The difference in speed between fetching from the database and fetching from the filesystem (as our file explorer window does) is very noticeable. After looking into many of the available alternatives, I'm starting to think that storing binaries like this in the database isn't so clever.

Although we are getting transactional integrity and an easy backup/restore mechanism (one database file), we can overcome many limitations by storing a URL/moniker and putting the images on the file system, including the ability to add movie files too.

I found an interesting blog post on what is actually going on in the average system when you manipulate image blobs: http://mysqldump.azundris.com/archiv...-Database.html I found it insightful.

My only two concerns with the file system approach are (1) ensuring integrity between the file system and database and (2) backing up and restoring.
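
For concern (1), a periodic consistency check is one possible safeguard. The sketch below assumes SQLite instead of SQL Server for brevity, and a hypothetical `images` table whose `location` column holds a path relative to the image root; `find_mismatches` is an illustrative name, not an existing tool.

```python
import sqlite3
import tempfile
from pathlib import Path


def find_mismatches(db, root):
    """Compare the locations recorded in the database with the files on disk.

    Returns (missing, orphans): rows whose file is gone, and files
    that no row refers to. Both lists are sorted relative paths.
    """
    recorded = {row[0] for row in db.execute("SELECT location FROM images")}
    on_disk = {
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    }
    return sorted(recorded - on_disk), sorted(on_disk - recorded)


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE images (location TEXT NOT NULL)")
    db.execute("INSERT INTO images VALUES ('a/1.jpg')")
    (root / "a").mkdir()
    (root / "a" / "2.jpg").write_bytes(b"")
    print(find_mismatches(db, root))
```

A check like this only detects drift after the fact. It does not make the two stores transactional, so the order of writes (file first, row second, or the reverse) still decides which kind of leftover a crash produces.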