"Dimensioned" numeric type

comp.databases.postgresql


D Yuniskis
"Dimensioned" numeric type - 03-01-2011, 11:25 AM

Hi,

There are a bunch of "issues" here but I will try to
reduce my question to the most important one(s)...
[I sincerely hope I haven't asked this before]

Almost *every* numeric datum in my tables is a
dimensioned quantity. E.g., "3 inches", "6 pounds",
"14.7 lumens", etc.

A column's "dimension" is consistent for all values
within that column. E.g., a column may have the
dimension of "length". So, "3 feet", "1 yard",
"2 meters", "83.5 parsecs" would all be valid values
for such a column.

[note that "3 feet" is different from "1 yard" -- even
though they both might *resolve* to the same physical
length]

The first "problem" is to define a mechanism whereby I
can enforce a particular "dimensioned type" on a column.
So, I can define a column as "type length" and, thereafter,
be assured that PG (with my mechanisms) enforces this
constraint on all values entered into that column.

I can come up with a representation that allows me to
accurately preserve values as entered (e.g., noting that
"3.0 feet", "36 inches", "1 yard" and "3 feet" are each
*different* and need to be recalled as such) yet gives
me some expediency in comparison operations (e.g., for
sorting).

[I didn't claim this was going to be *efficient*! :> ]

So, for example, I can store the "input string", as entered,
to preserve the subtleties in the original representation
(e.g., "3 ft" vs. "3.0000 ft" vs. "1 yd" vs. "2 ft 12 in"!).

I can come up with a "sorting value" for the magnitude of
the value represented (in bogounits).

I can parse the string (prior to "blessing it") to verify the
dimensions within the string are consistent among themselves
(e.g., "2 ft 12 in" is OK but "2 ft 3 oz" is not).

I can codify the overall "dimension" of the value (e.g., "length"
vs. "angle" vs. "length / time").

And, of course, I can compare this with the "defined type" of
the column to verify that the value entered is appropriate
for this column.
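
A minimal sketch of such a representation, in Python rather than inside PG. The unit table `UNITS` and the names `Dimensioned`, `parse` and `check_column` are hypothetical, and the input is assumed to be whitespace-separated `<value> <unit>` pairs. It keeps the text as entered, derives an exact sort value, rejects mixed dimensions, and checks the result against a column's declared dimension:

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical unit table: name -> (dimension, exact factor to a base unit).
UNITS = {
    "in": ("length", Fraction(254, 10000)),          # metres
    "ft": ("length", Fraction(3048, 10000)),         # metres
    "yd": ("length", Fraction(9144, 10000)),         # metres
    "oz": ("mass", Fraction(45359237, 1600000000)),  # kilograms
}


@dataclass(frozen=True)
class Dimensioned:
    raw: str            # text exactly as entered, never rewritten
    dimension: str      # e.g. "length"
    sort_key: Fraction  # magnitude in base units, used only for ordering


def parse(text: str) -> Dimensioned:
    """Parse '<value> <unit>' pairs and verify they share one dimension."""
    tokens = text.split()
    if not tokens or len(tokens) % 2:
        raise ValueError(f"expected '<value> <unit>' pairs: {text!r}")
    dims = set()
    total = Fraction(0)
    for value, unit in zip(tokens[0::2], tokens[1::2]):
        if unit not in UNITS:
            raise ValueError(f"unknown unit {unit!r}")
        dim, factor = UNITS[unit]
        dims.add(dim)
        total += Fraction(value) * factor
    if len(dims) != 1:
        raise ValueError(f"mixed dimensions in {text!r}: {sorted(dims)}")
    return Dimensioned(text, dims.pop(), total)


def check_column(value: Dimensioned, column_dimension: str) -> Dimensioned:
    """Reject a value whose dimension differs from the column's."""
    if value.dimension != column_dimension:
        raise ValueError(
            f"{value.raw!r} is a {value.dimension}, not a {column_dimension}"
        )
    return value
```

With this table, "3 ft", "1 yd" and "2 ft 12 in" all get the same `sort_key` while their `raw` strings stay distinct, and "2 ft 3 oz" is rejected at parse time.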

But, where should I *keep* that "definition"? I.e., it does
not vary on a row by row basis so it doesn't deserve to be part
of the individual data. And, it is "defined" before any
*values* are inserted into that column (i.e., its definition
exists even in the absence of data).

I suspect twiddling the system tables is frowned upon?
But, I could always create my *own* version of the system
tables layered atop everything??

(sigh) Too early in the morning for this sort of thinking.
My head hurts... :-(

Thanks!
--don

Jasen Betts
Re: "Dimensioned" numeric type - 03-09-2011, 04:13 AM

On 2011-03-01, D Yuniskis <not.going.to.be (AT) seen (DOT) com> wrote:
Quote:
[note that "3 feet" is different from "1 yard" -- even
though they both might *resolve* to the same physical
length]

The first "problem" is to define a mechanism whereby I
can enforce a particular "dimensioned type" on a column.
So, I can define a column as "type length" and, thereafter,
be assured that PG (with my mechanisms) enforces this
constraint on all values entered into that column.

One way: check the value against the appropriate regular expression.

another way: attempt to use 'units' or a similar program to convert
the offered value into a chosen base type (eg: for length convert it to
metres)

Quote:
I can come up with a representation that allows me to
accurately preserve values as entered (e.g., noting that
"3.0 feet", "36 inches", "1 yard" and "3 feet" are each
*different* and need to be recalled as such) yet gives
me some expediency in comparison operations (e.g., for
sorting).

something like 'units'? or design your own if you don't need obscure
units like 'thomb'

$ units 'three inch' mm
* 76.2

A c function (or several) using libudunits might be the way
to go. I have never used libudunits, 'units' does not use it either.

Quote:
I suspect twiddling the system tables is frowned upon?
But, I could always create my *own* version of the system
tables layered atop everything??

(sigh) Too early in the morning for this sort of thinking.
My head hurts... :-(

just create several new types of text and new comparison operators
that convert to a base type before comparing.

absolute temperature will need to be a different type from relative temperature.

units takes " as seconds of arc rather than as inches, but it may be
possible to force libudunits to do what you want.

--
⚂⚃ 100% natural

D Yuniskis
Re: "Dimensioned" numeric type - 03-09-2011, 06:59 PM

Hi Jasen,

On 3/9/2011 3:13 AM, Jasen Betts wrote:
Quote:
On 2011-03-01, D Yuniskis<not.going.to.be (AT) seen (DOT) com> wrote:

[note that "3 feet" is different from "1 yard" -- even
though they both might *resolve* to the same physical
length]

The first "problem" is to define a mechanism whereby I
can enforce a particular "dimensioned type" on a column.
So, I can define a column as "type length" and, thereafter,
be assured that PG (with my mechanisms) enforces this
constraint on all values entered into that column.

One way: check the value against the appropriate regular expression.

That's not trivial. The grammar is something like:

(<value>(/<value>)? <unit>)+

But valid <unit> are defined, ultimately, in a table within
PG. Parsing an expression can require resolving a <unit>
to determine its compatibility with other <unit>s in the
same "expression".

As a (silly) example, the following is a valid entry:

5.2 g 17/3 ft / sec / sec 2 m / min / sec

since all units resolve to "length / (time * time)".
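
A rough sketch of that resolution step, assuming whitespace-separated tokens, fractions written without spaces ("17/3"), and a hypothetical in-memory table `UNIT_DIMS` standing in for the table kept in PG. Each unit maps to exponents of (length, time, mass), and every "/ <unit>" subtracts the divisor's exponents:

```python
import re

# Hypothetical unit table: name -> exponents of (length, time, mass).
UNIT_DIMS = {
    "ft": (1, 0, 0),
    "m": (1, 0, 0),
    "sec": (0, 1, 0),
    "min": (0, 1, 0),
    "g": (1, -2, 0),   # standard gravity, i.e. an acceleration
    "oz": (0, 0, 1),
}

# <value>(/<value>)? with no embedded spaces, e.g. "5.2" or "17/3".
NUMBER = re.compile(r"\d+(\.\d+)?(/\d+(\.\d+)?)?")


def term_dimensions(text):
    """Return the dimension exponents of each '<value> <unit>' term."""
    tokens = text.split()
    dims = []
    i = 0
    while i < len(tokens):
        if not NUMBER.fullmatch(tokens[i]):
            raise ValueError(f"expected a value, got {tokens[i]!r}")
        i += 1
        if i >= len(tokens) or tokens[i] not in UNIT_DIMS:
            raise ValueError("expected a known unit after the value")
        dim = UNIT_DIMS[tokens[i]]
        i += 1
        # Each "/ <unit>" divides, so subtract the divisor's exponents.
        while i < len(tokens) and tokens[i] == "/":
            if i + 1 >= len(tokens) or tokens[i + 1] not in UNIT_DIMS:
                raise ValueError("expected a known unit after '/'")
            divisor = UNIT_DIMS[tokens[i + 1]]
            dim = tuple(a - b for a, b in zip(dim, divisor))
            i += 2
        dims.append(dim)
    return dims


def overall_dimension(text):
    """Return the single dimension shared by all terms, or raise."""
    dims = term_dimensions(text)
    if not dims:
        raise ValueError("empty expression")
    if len(set(dims)) != 1:
        raise ValueError(f"inconsistent dimensions in {text!r}")
    return dims[0]
```

For the silly example above, every term comes out as `(1, -2, 0)`, i.e. length / (time * time), while "2 ft 3 oz" raises because its two terms disagree.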

[I have no idea why anyone would opt to specify a
quantity this bizarrely but it makes as much sense
as "3 ft 2 in"]

Quote:
another way: attempt to use 'units' or a similar program to convert
the offered value into a chosen base type (eg: for length convert it to
metres)

I don't resolve an expression to a "real" value because
this throws away information. E.g., "3" and "3.000" are
different as are "12 inches" and "1 foot" (neglecting the
ambiguity in determining *which* "foot" is meant, here).

I thought I could possibly store an *approximation* of
a "real value" as an *interval* (i.e., (min,max) range)
and use that to expedite comparisons. So, the compare
operator that I would have to write for the type could
do something like:

    if (a.min > b.max)
        return A_GREATER_THAN_B
    else if (a.max < b.min)
        return A_LESS_THAN_B
    else
        /* ranges overlap so more expensive test required */

and *hope* that the early tests handle most comparisons.
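
The same idea as a runnable Python sketch. The class name `Approx` and the `exact` field are hypothetical; here a `Fraction` stands in for whatever the "more expensive test" would really evaluate:

```python
from dataclasses import dataclass
from fractions import Fraction
from functools import cmp_to_key


@dataclass(frozen=True)
class Approx:
    lo: float        # lower bound of the stored approximation
    hi: float        # upper bound of the stored approximation
    exact: Fraction  # stand-in for the expensive exact representation


def compare(a: Approx, b: Approx) -> int:
    """Return 1, -1 or 0, touching `exact` only when the ranges overlap."""
    if a.lo > b.hi:
        return 1
    if a.hi < b.lo:
        return -1
    # Ranges overlap, so the cheap test is inconclusive: do the slow one.
    return (a.exact > b.exact) - (a.exact < b.exact)


def sort_values(values):
    """Sort using the interval comparison as the ordering."""
    return sorted(values, key=cmp_to_key(compare))
```

How often the slow path runs depends entirely on how tight the intervals are relative to the spacing of the stored values.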

Quote:
I can come up with a representation that allows me to
accurately preserve values as entered (e.g., noting that
"3.0 feet", "36 inches", "1 yard" and "3 feet" are each
*different* and need to be recalled as such) yet gives
me some expediency in comparison operations (e.g., for
sorting).

something like 'units'? or design your own if you don't need obscure
units like 'thomb'

$ units 'three inch' mm
* 76.2

A c function (or several) using libudunits might be the way
to go. I have never used libudunits, 'units' does not use it either.

These all force binding the expression to a real value
before storing it. See above.

Quote:
I suspect twiddling the system tables is frowned upon?
But, I could always create my *own* version of the system
tables layered atop everything??

(sigh) Too early in the morning for this sort of thinking.
My head hurts... :-(

just create several new types of text and new comparison operators
that convert to a base type before comparing.

If I restrict the type to just a special "text", then I
lose the possibility of any optimizations for comparisons, etc.
(because there are no other "members" in the type that I
can stuff information into -- like "(min, max)" alluded to
above)

Quote:
absolute temperature will need to be a different type from relative temperature.

Yes. I'm not worried about handling the unit conversions, etc.
Rather, trying to let PG enforce what it *should* enforce
("this column has type 'temperature'; this other column
has type 'length / time'; etc.") in as efficient a way as
possible.

Quote:
units takes " as seconds of arc rather than as inches, but it may be
possible to force libudunits to do what you want.

Jasen Betts
Re: "Dimensioned" numeric type - 03-10-2011, 03:13 AM

On 2011-03-10, D Yuniskis <not.going.to.be (AT) seen (DOT) com> wrote:
Quote:
Hi Jasen,


The first "problem" is to define a mechanism whereby I
can enforce a particular "dimensioned type" on a column.
So, I can define a column as "type length" and, thereafter,
be assured that PG (with my mechanisms) enforces this
constraint on all values entered into that column.

another way: attempt to use 'units' or a similar program to convert
the offered value into a chosen base type (eg: for length convert it to
metres)

I don't resolve an expression to a "real" value because
this throws away information. E.g., "3" and "3.000" are
different as are "12 inches" and "1 foot" (neglecting the
ambiguity in determining *which* "foot" is meant, here).

What I mean is: attempt the conversion, and if that succeeds, the
expression is valid.



Quote:
I thought I could possibly store an *approximation* of
a "real value" as an *interval* (i.e., (min,max) range)
and use that to expedite comparisons. So, the compare
operator that I would have to write for the type could
do something like:

floating-point types are by definition approximations.

Quote:
    if (a.min > b.max)
        return A_GREATER_THAN_B
    else if (a.max < b.min)
        return A_LESS_THAN_B
    else
        /* ranges overlap so more expensive test required */

and *hope* that the early tests handle most comparisons.

Where are you getting min and max from?

Quote:
I can come up with a representation that allows me to
accurately preserve values as entered (e.g., noting that
"3.0 feet", "36 inches", "1 yard" and "3 feet" are each
*different* and need to be recalled as such) yet gives
me some expediency in comparison operations (e.g., for
sorting).

something like 'units'? or design your own if you don't need obscure
units like 'thomb'

$ units 'three inch' mm
* 76.2

A c function (or several) using libudunits might be the way
to go. I have never used libudunits, 'units' does not use it either.

These all force binding the expression to a real value
before storing it. See above

No, they don't. Store the string and use units for the comparison.

$ units "6 miles - 10 kilometres"
Definition: -343.936 m

negative result means '10 kilometres' > '6 miles'
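
The arithmetic can be checked without 'units'. The sketch below is a stand-in, not a wrapper around the real program: `METRES_PER`, `to_metres` and `compare_stored` are hypothetical names, and only the two units from the example are defined. The stored strings are left untouched and are converted only at comparison time:

```python
from fractions import Fraction

# Exact factors to metres; only the two units from the example are defined.
METRES_PER = {
    "miles": Fraction(1609344, 1000),  # international mile = 1609.344 m
    "kilometres": Fraction(1000),
}


def to_metres(text: str) -> Fraction:
    """Convert a single '<value> <unit>' string to exact metres."""
    value, unit = text.split()
    return Fraction(value) * METRES_PER[unit]


def compare_stored(a: str, b: str) -> int:
    """Return -1, 0 or 1 from the sign of a - b."""
    diff = to_metres(a) - to_metres(b)
    return (diff > 0) - (diff < 0)


diff = to_metres("6 miles") - to_metres("10 kilometres")
print(float(diff))                                 # -343.936
print(compare_stored("6 miles", "10 kilometres"))  # -1
```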

Quote:
just create several new type of text and new comparison operators
that convert to a base type before comparing.

If I restrict the type to just a special "text", then I
lose the possibility of any optimizations for comparisons, etc.
(because there are no other "members" in the type that I
can stuff information into -- like "(min, max)" alluded to
above)

Why do you need "(min, max)"? the precision of a double precision float
exceeds the capability of all common measuring apparatus.

--
⚂⚃ 100% natural

D Yuniskis
Re: "Dimensioned" numeric type - 03-10-2011, 09:46 AM

On 3/10/2011 2:13 AM, Jasen Betts wrote:
Quote:
On 2011-03-10, D Yuniskis<not.going.to.be (AT) seen (DOT) com> wrote:
Hi Jasen,


The first "problem" is to define a mechanism whereby I
can enforce a particular "dimensioned type" on a column.
So, I can define a column as "type length" and, thereafter,
be assured that PG (with my mechanisms) enforces this
constraint on all values entered into that column.

another way: attempt to use 'units' or a similar program to convert
the offered value into a chosen base type (eg: for length convert it to
metres)

I don't resolve an expression to a "real" value because
this throws away information. E.g., "3" and "3.000" are
different as are "12 inches" and "1 foot" (neglecting the
ambiguity in determining *which* "foot" is meant, here).

what I mean is attempt the conversion, and if that's successful the
expression is valid

Ah, OK.

Quote:
I thought I could possibly store an *approximation* of
a "real value" as an *interval* (i.e., (min,max) range)
and use that to expedite comparisons. So, the compare
operator that I would have to write for the type could
do something like:

floating-point types are by definition approximations.

Yes. min + 1 ULP = max.

The point is to avoid performing a "true" comparison unless you
are forced to (because the stored "approximation" is
ambiguous).

Quote:
    if (a.min > b.max)
        return A_GREATER_THAN_B
    else if (a.max < b.min)
        return A_LESS_THAN_B
    else
        /* ranges overlap so more expensive test required */

and *hope* that the early tests handle most comparisons.

where are you getting min and max from ?

When the expression is initially parsed and its representation
created, determine an interval (min,max) that approximates the
value FOR COMPARISONS.
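
One way such an interval might be derived, sketched in Python under the assumption that the parser can produce the value as an exact rational (the function name `bracket` is hypothetical). It returns either a single double, when the value is exactly representable, or two adjacent doubles one ULP apart that straddle it. `math.nextafter` needs Python 3.9 or later:

```python
import math
from fractions import Fraction


def bracket(exact: Fraction) -> tuple[float, float]:
    """Return (min, max) doubles with min <= exact <= max."""
    nearest = float(exact)        # rounded to the nearest double
    as_exact = Fraction(nearest)  # every finite double is an exact rational
    if as_exact == exact:
        return nearest, nearest   # exactly representable, zero-width interval
    if as_exact < exact:
        # Rounded down: the next double up is the upper bound.
        return nearest, math.nextafter(nearest, math.inf)
    # Rounded up: the next double down is the lower bound.
    return math.nextafter(nearest, -math.inf), nearest
```

For example, `bracket(Fraction(1, 2))` collapses to a single double, while `bracket(Fraction(1, 3))` yields two neighbouring doubles with 1/3 strictly between them. Values too large for a double raise `OverflowError` in `float(exact)`, which this sketch does not handle.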

Quote:
I can come up with a representation that allows me to
accurately preserve values as entered (e.g., noting that
"3.0 feet", "36 inches", "1 yard" and "3 feet" are each
*different* and need to be recalled as such) yet gives
me some expediency in comparison operations (e.g., for
sorting).

something like 'units'? or design your own if you don't need obscure
units like 'thomb'

$ units 'three inch' mm
* 76.2

A c function (or several) using libudunits might be the way
to go. I have never used libudunits, 'units' does not use it either.

These all force binding the expression to a real value
before storing it. See above

no they don't. store the string, use units for the comparison.

$ units "6 miles - 10 kilometres"
Definition: -343.936 m

negative result means '10 kilometres' > '6 miles'

I only want to do that for cases where the "stored approximation"
is "ambiguous" -- it is too expensive to be doing *often* (and
I am gambling that the approximation might be "good enough"
for most compares).

Another approach is to store a sort index on each INSERT or UPDATE,
assuming values are retrieved more often than altered/added.

Quote:
just create several new type of text and new comparison operators
that convert to a base type before comparing.

If I restrict the type to just a special "text", then I
lose the possibility of any optimizations for comparisons, etc.
(because there are no other "members" in the type that I
can stuff information into -- like "(min, max)" alluded to
above)

Why do you need "(min, max)"? the precision of a double precision float
exceeds the capability of all common measuring apparatus.

Because expressions can express things that can't be
"measured"! :> Why not *legislate* pi to be equal to
"(double) pi"?

--don
