dbTalk Databases Forums  

satellite data in an sql database : opinions?

comp.databases comp.databases


Discuss satellite data in an sql database : opinions? in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
jims
 
Posts: n/a

Default satellite data in an sql database : opinions? - 10-26-2006 , 11:43 AM






Hi.

I'm trying to figure out whether or not to convert a home-grown data
archiving and processing system in to one based on mySql.

Our data is basically grid data from satellites. There's lots of it.
Typically, we'd have an area of 200 x 200 points every hour of the day
where there's sunlight. So for a ten hour day we'd have

200 x 200 x sizeof(float) x 10 = 1.6Mb a day

We'd typically have several year's worth of data like this and in some
cases the grid size would be considerably larger. Also, it's quite
likely that we'll be storing several parameters, not just one so this
could scale up by an order of magnitude or two.

My interest in using mySql stems from a desire to get this data out of
yearly binary data files and into a large continuous database -- which
doesn't necessarily call for RDBMS functionalily -- but also because I
think the processing of the data could be simplified with selecting and
sorting fuctions being pushed off on the database side. We could do
some interesting stuff with the data afterwards if we had RDBMS
functionality..

The processing of the data takes place in two modes:

1.) the spatial domain, in which we consider just an hourly grid at a
time, usually checking pixel values against their neighbors. In this
case we'd be grabbing all pixels with a particular time stamp or
grid_id.

and

2.) the time domain, in which we process one point at a time, over the
entire data set. In this case we'd be keying off of a pixel_id of some
sort. That's all simple enough but in order to work through the entire
grid space, I'd be making, in the case given above, 1.6 million sql
calls through some DBI in order to process all the data. Actually, I'd
probably be making several times that when you consider that I'll
likely be writing data out as I go.

If anybody has some words of advice or caution, I'd be appreciative.
Performance is pretty important as one of the problems with the current
system is its sluggishness. Case 2 is obviously the big concern here.
I'm aware of the new(ish) spatial extension to the various sql systems,
including mySql, but I'm having trouble finding examples of its use
that are close to my situation. The current system works through all
the data on a time scale of days. I'm hoping to pare that down to
hours.

In summary, I think I'm in the position of making lots and lots of sql
calls on a fairly large dataset from my perl/C code and am wondering if
that's going to slow me down to a crawl. Having our data in sql would
have other advantages, such as ease of web-based distribution, but if
it's ultra-slow I'll have to come up with something else.

Thanks,

Jim Schlemmer


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.