dbTalk Databases Forums  

BLOB storage options

microsoft.public.sqlserver.server microsoft.public.sqlserver.server


Discuss BLOB storage options in the microsoft.public.sqlserver.server forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Bill E. Hollywood, FL
 
Posts: n/a

Default BLOB storage options - 09-26-2011 , 06:14 AM






I will be creating a web application where upload and modification of
documents will be an important part of the application. Specifically,
users will upload documents (typically .pdf or .doc/docx). These will
be modified by other users and saved as different "versions" of the
same document. There will also be a great deal of other information
both related and unrelated to documents.

I am considering options for storing the documents both inside and
outside of SQL Server. I am looking at the following options:

1. Store files in the server file system and include a path or partial
path to the file in a varchar type column.
2. Store files in the database in a table that includes a
varbinary(max) or other acceptable column type. Options here include
using filestream as well as using a separate filegroup for BLOB
storage.
3. Using a separate database on the same SQL Server for the BLOBs.
4. Other options?

I have experience with 1 and 2 and I understand some of the benefits
and drawbacks of each. However, I have another requirement--that I be
able to backup and restore the database WITHOUT the BLOBs and still
have a working database that I can use for development. Option 1
gives me this but I'm not convinced that I can easily do that with
option 2 even if BLOBs are in a separate filegroup and I do filegroup
restores (please correct me if I'm wrong).

Option 3 would give me the flexibility I need but I would still need
to create numerous triggers to simulate the foreign key constraints
that I would naturally create between the tables in a single
database. I'm not sure that this is a big deal, however. It does
mean that references to my BLOB database would be hardcoded into my
triggers and procedures so I might need to use different SQL instances
for test and production environments.

Does anyone have experience with option 3? Is it a good option and if
so, how did you handle it? Are there other approaches that I should
consider?

Thanks,

Bill E.
Hollywood, FL

Reply With Quote
  #2  
Old   
Erland Sommarskog
 
Posts: n/a

Default Re: BLOB storage options - 09-26-2011 , 02:45 PM






Bill E. Hollywood, FL (behrreich (AT) compuserve (DOT) com) writes:
Quote:
1. Store files in the server file system and include a path or partial
path to the file in a varchar type column.
2. Store files in the database in a table that includes a
varbinary(max) or other acceptable column type. Options here include
using filestream as well as using a separate filegroup for BLOB
storage.
3. Using a separate database on the same SQL Server for the BLOBs.
4. Other options?

I have experience with 1 and 2 and I understand some of the benefits
and drawbacks of each. However, I have another requirement--that I be
able to backup and restore the database WITHOUT the BLOBs and still
have a working database that I can use for development. Option 1
gives me this but I'm not convinced that I can easily do that with
option 2 even if BLOBs are in a separate filegroup and I do filegroup
restores (please correct me if I'm wrong).
I certainly like option 3 better than option 1. BACKUP/RESTORE on two
databases is easier than having to restore a database and a file system.
And with option 3, you could build the database on views, so it looks
that the data is in one table, permitting you to merge the databases
later on if you wish.

As for the second option, I don't what happens if you restore the database
without the BLOB filegroup. In the best of worlds, you should be able
to access the rest of the data, although I don't know whether it actually
works that way. So much is clear, if you try to access the BLOB data,
you will lose.

Then again, if you want database without the blobs for development,
what will you have instead of the blobs? Just have a fixed set of documents
and only work with these for your test cases?

How big do you expect the production database to be?


--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx

Reply With Quote
  #3  
Old   
Bill E. Hollywood, FL
 
Posts: n/a

Default Re: BLOB storage options - 09-27-2011 , 09:50 AM



Erland,

Thanks for your response.

I'm not sure how large the database will be for just data, but I am
expecting three or four thousand document uploads per month once
things get going. Therefore, the file storage and retrieval component
will not be trivial. It's hard to say how large the documents might
be because they could be in many formats. Let's assume an average of
500 kB per document and about 4,000 documents per month -- that gives
us 2GB per month of just document storage.

I have an existing database from a different application for the same
client and it is around 1 GB in size after many years of operation.
This isn't necessarily a good indicator for this new database but I
would still feel comfortable saying that the file storage will be much
larger than the other data storage.

I have sketched out a plan where I am storing the normal file
parameters in a table inside the main database but instead of a file
path or file binary column, I have an identifier that "relates" to an
identifier in the "file" database as below:

--"Main" database
CREATE TABLE dbo.Files
(
File_ID int IDENTITY(1,1) NOT NULL,
Request_ID int NOT NULL,
Description varchar(250) NULL,
Date_Created datetime NOT NULL,
User_ID int NOT NULL,
File_Type varchar(250) NULL,
File_Size int NULL,
CONSTRAINT Files_PK PRIMARY KEY CLUSTERED
(
File_ID ASC
),
--Other constraints here
)

--"File" database
CREATE TABLE dbo.File_Data
(
File_ID int NOT NULL,
File_Binary varbinary(max) NULL,
CONSTRAINT File_Data_PK PRIMARY KEY CLUSTERED
(
File_ID ASC
)
)

Triggers would be created to enforce a relationship between the two
tables on column File_ID such that rows are always inserted/deleted
together.

In development mode I would have a File database with an empty or
mostly empty File_Data table. When viewing a file, if there is no
matching row in File_Data I would return a NULL for File_Binary. Even
though the file will not be returned and thus not viewable, the
procedures would still work and searches could still be done on
existing files.

This is a first approach so critiques would be welcomed.

Bill

Reply With Quote
  #4  
Old   
Erland Sommarskog
 
Posts: n/a

Default Re: BLOB storage options - 09-27-2011 , 03:56 PM



Bill E. Hollywood, FL (behrreich (AT) compuserve (DOT) com) writes:
Quote:
I'm not sure how large the database will be for just data, but I am
expecting three or four thousand document uploads per month once
things get going. Therefore, the file storage and retrieval component
will not be trivial. It's hard to say how large the documents might
be because they could be in many formats. Let's assume an average of
500 kB per document and about 4,000 documents per month -- that gives
us 2GB per month of just document storage.
500 kB appears a tad high to me, but that is just gut feeling.

But even if we accept this number - after three years the database
will be 72 GB - provided that documents are kept "forever" and not
archived anywere.

To be honest, I am not sure that this warrants a more complex solution
to make it easier to have a development database, unless are other reasons
you want a document-less database for development. True, restoring
72 GB takes longer time than restoring 1GB, but it's not an operation
I expect you do very often.

--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx

Reply With Quote
  #5  
Old   
Bill E. Hollywood, FL
 
Posts: n/a

Default Re: BLOB storage options - 09-27-2011 , 05:01 PM



Thanks, Erland.

These databases will live on a remote 3rd party virtual server and
current connection speeds are not great (sometimes as low as 1 to 2
Mbps) so this is the reason. Yes, restores won't occur very often but
when they do it will be a pain. I came up with the 500 kb file size
by averaging the size of a few PDF reports I had on my hard drive.

Bill

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.