dbTalk Databases Forums  

Is there open source library for Xtree ?

comp.databases comp.databases


Discuss Is there open source library for Xtree ? in the comp.databases forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
shna
 
Posts: n/a

Default Is there open source library for Xtree ? - 11-19-2006 , 09:29 AM






Hi all.

I need to implement some query operations on high-dimensional spatial data
(# dims is at least 10, and the number of objects is more than 100,000).

Among several spatial solutions, Xtree (Berchtold proposes) seems to be
reasonable in my problem.

Even though I think that there are many open sources for implementing Xtree,
I cannot find useful package.
(Many libraries seem to focus on very low dimensional data for specific
application such as GIS. )

One library is found in the wen site related to R-tree
(http://www.rtreeportal.org/code.html).
However, its implementation is not generalized so that my OS environment
cannot compile it.

Could you provide some information for X-tree library ?

Furthermore, how do you think for using X-tree in my problem ?
Is there more acceptable solution than X-tree ?

Thank you.

From Seung-Hoon Na.




Reply With Quote
  #2  
Old   
Neo
 
Posts: n/a

Default Re: Is there open source library for Xtree ? - 11-19-2006 , 02:11 PM







Quote:
I need to implement some query operations on high-dimensional spatial data (# dims is at least 10, and the number of objects is more than 100,000).
I am not familiar with spatial data or XTree, but why couldn't this be
implemented in a regular db with 100,000 records each having 10
attributes? Could you describe some sample data?



Reply With Quote
  #3  
Old   
shna
 
Posts: n/a

Default Re: Is there open source library for Xtree ? - 11-20-2006 , 05:23 AM




"Neo" <neo55592 (AT) hotmail (DOT) com> wrote

Quote:
I need to implement some query operations on high-dimensional spatial
data (# dims is at least 10, and the number of objects is more than
100,000).

I am not familiar with spatial data or XTree, but why couldn't this be
implemented in a regular db with 100,000 records each having 10
attributes? Could you describe some sample data?

Each dimension in object is numeric value.
Simply, data is extracted from multivariate Gaussian distribution.

We hope to fastly perform NN (Nearest neighbor) query operation for this
data. NN query is described as
* NN Query: For n dimensional point, find top n nearest neighbor objects.
where Euclidien distance for measuring distances between objects will be
used.

For spatial queries such as NN query, I know that DB society has developed
new data structure such as R-tree or X-tree.
In current my knowledge, R-tree is adequate for GIS (Geographic information
system) where few number of dimensions are used,
and, X-tree is more efficient for more high-dimensional data.

Is Regular DB support this spatial data?
I didnot check regular DB really support this type of data.









Reply With Quote
  #4  
Old   
Neo
 
Posts: n/a

Default Re: Is there open source library for Xtree ? - 11-20-2006 , 09:32 AM



Quote:
Each dimension in object is numeric value. We hope to fastly perform NN (Nearest neighbor) query operation for this data. NN query is described as * NN Query: For n dimensional point, find top n nearest neighbor objects.
Most likey, a regular db can model/query your data but the performance
may be unacceptable for your application. I still didn't understand
your application clearly. Is the following correct: There are approx
100,000 objects in space. You want to quickly determine the N closest
objects to a specified object.

Why not store raw data as follows:

Table1
Object x y z
--------- -------------
Obj1 3 4 9
Obj2 6 2 8
Obj3 2 4 7

Then create the following calculated data, sorted by columns ObjA and
Distance.

Table2
ObjA ObjB Distance Direction
____ ____ _______ ________
Obj1 Obj2 7 m
Obj1 Obj3 8 n
Obj2 Obj1 3 p
Obj2 Obj3 4 q
Obj3 Obj1 66 r
Obj3 Obj2 77 s

Then finding the N closest objects to any object should be relatively
fast/easy. Would this work in your case?



Reply With Quote
  #5  
Old   
shna
 
Posts: n/a

Default Re: Is there open source library for Xtree ? - 11-21-2006 , 05:15 AM




"Neo" <neo55592 (AT) hotmail (DOT) com> wrote

Quote:
Each dimension in object is numeric value. We hope to fastly perform NN
(Nearest neighbor) query operation for this data. NN query is described
as * NN Query: For n dimensional point, find top n nearest neighbor
objects.

Most likey, a regular db can model/query your data but the performance
may be unacceptable for your application. I still didn't understand
your application clearly. Is the following correct: There are approx
100,000 objects in space. You want to quickly determine the N closest
objects to a specified object.

Why not store raw data as follows:

Table1
Object x y z
--------- -------------
Obj1 3 4 9
Obj2 6 2 8
Obj3 2 4 7

Then create the following calculated data, sorted by columns ObjA and
Distance.

Table2
ObjA ObjB Distance Direction
____ ____ _______ ________
Obj1 Obj2 7 m
Obj1 Obj3 8 n
Obj2 Obj1 3 p
Obj2 Obj3 4 q
Obj3 Obj1 66 r
Obj3 Obj2 77 s

Then finding the N closest objects to any object should be relatively
fast/easy. Would this work in your case?

This pre-construction will be worked, but only when the number of data is
small.

However, in our case, we should deal with 100,000 objects. (or more than
100,000)
Then, the number of all possible pairs is 100,000^2 !. Thus, the
pre-construction is impossible due to very large complexity (store and
time).

Even though it is possible, for new query point of which is not same as ones
in any objects,
distances for ALL objects should be calculated. Without any index structure,
time complexity is at least O(n). (Sorting makes it O(n logn) when n is the
number of objects. )

Note that in one dimensional case, indexing using B-tree or B+tree can be
feasible data structure for this problem.
Even in two dimensional cases, such normal data structure can not be worked.




Reply With Quote
  #6  
Old   
Neo
 
Posts: n/a

Default Re: Is there open source library for Xtree ? - 11-21-2006 , 09:50 AM



Quote:
Then finding the N closest objects to any object should be relatively
fast/easy. Would this work in your case?

... deal with +100,000 objects. Then, the number of all possible pairs
is 100,000^2 !. Thus, the pre-construction is impossible due to
very large complexity (store and time).
In you application that will manage +100,000 objects, how much RAM and
hard disk space is available? Will this app run on a PC or specialized
computer? And how fast (ie 1 ms) do you need to calculate the nearest N
objects after a new object has been entered?

Quote:
... Note that in one dimensional case, indexing using B-tree or B+tree can be
feasible data structure for this problem.
Even in two dimensional cases, such normal data structure can not be worked.
Would a fuzzy method like the following work to find N closest objects
in space. Roughly, the method is to consider objects within a cube
space around an object. Increase/decrease cube space until it contains
slightly more than N objects. Then do final calculations to get N
closest objects.

Suppose we have three object lists. First is sorted by X (shown below),
second sorted by Y, third sorted by Z. New objects to be inserted in
appropriate location within each list.

Obj X
___ ___
obj1 3
obj2 5
obj3 100
obj4 178
obj5 245

To find N objects close to obj3. First pick a delta. Lets assume 10. In
first table, find objects whose X is within 90 and 110. In second
table, find objects within 10 units of obj3's Y. In third, table find
objects within 10 units of obj3's Z. Increase/decrease delta to get
slightly over N objects. Perform final calculation to determine the N
closest to obj3.



Reply With Quote
  #7  
Old   
shna
 
Posts: n/a

Default Re: Is there open source library for Xtree ? - 11-23-2006 , 10:34 AM




Quote:
Would a fuzzy method like the following work to find N closest objects
in space. Roughly, the method is to consider objects within a cube
space around an object. Increase/decrease cube space until it contains
slightly more than N objects. Then do final calculations to get N
closest objects.

Suppose we have three object lists. First is sorted by X (shown below),
second sorted by Y, third sorted by Z. New objects to be inserted in
appropriate location within each list.

Obj X
___ ___
obj1 3
obj2 5
obj3 100
obj4 178
obj5 245

To find N objects close to obj3. First pick a delta. Lets assume 10. In
first table, find objects whose X is within 90 and 110. In second
table, find objects within 10 units of obj3's Y. In third, table find
objects within 10 units of obj3's Z. Increase/decrease delta to get
slightly over N objects. Perform final calculation to determine the N
closest to obj3.

Thank you for your good help. Surely, this method can be a feasible
approximated solution even in large size of databases.
In fact, I tried to take this type of method in my problem, however, I
observed that the larger the number of dimensions, the larger delta should
be. That is, as the number of dimensions is just 10, approximation error
can be much increased.
One alternative approach is to perform dimensionality reduction such as
PCA(Principal component analysis), and then apply your method.
I cannot confirm whether this approach is worked well.
However, more importantly, the critical problem of this alternative one is
to perform PCA which is highly time-consuming process.
Maybe, PCA cannot be calulated in our database.







Reply With Quote
  #8  
Old   
Neo
 
Posts: n/a

Default Re: Is there open source library for Xtree ? - 11-23-2006 , 11:18 AM



The following topic in comp.object seems similiar to yours:

http://groups.google.com/group/comp....f3e3dcb11d1f9a


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.