dbTalk Databases Forums  

Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ?

comp.databases.theory comp.databases.theory


Discuss Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ? in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #21  
Old   
Rafael Anschau
 
Posts: n/a

Default Re: Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ? - 04-09-2010 , 01:08 PM






On Apr 9, 2:18*pm, Bob Badour <bbad... (AT) pei (DOT) sympatico.ca> wrote:

Quote:
The proof is trivial. 20 decimal digits have 10^20 distinct values. 64
binary digits have 2^64 distinct values.

10^20 > 2^64 > 10^19
Sorry Bob, I havenīt expressed myself right. The proof that
no *binary conversion* algorithm will ever do is trivial.

My question is really theoretical, how can I demonstrate from a given
set of accepted axioms, that no other algorithm(excluding binary
conversion) will ever do it.

Stuff like "convert and compress, compress by xamanic enlightenment,
compression by reiki" or anything else.

My understanding is that a proof on the subset of all binary
conversion algorithms
mathematically doesnīt transfer to the super set of all
algorithms.Even though
I intuitively know thatīs true.

But I understand this is way of topic, so I am off to comp.theory.

Regards and thanks again,

Rafael

Reply With Quote
  #22  
Old   
Bob Badour
 
Posts: n/a

Default Re: WhatÂīs the algorithm that compresses a 20 digit big int, into 8 bytes ? - 04-09-2010 , 02:05 PM






Rafael Anschau wrote:

Quote:
On Apr 9, 2:18 pm, Bob Badour <bbad... (AT) pei (DOT) sympatico.ca> wrote:

The proof is trivial. 20 decimal digits have 10^20 distinct values. 64
binary digits have 2^64 distinct values.

10^20 > 2^64 > 10^19

Sorry Bob, I havenÂīt expressed myself right. The proof that
no *binary conversion* algorithm will ever do is trivial.

My question is really theoretical, how can I demonstrate from a given
set of accepted axioms, that no other algorithm(excluding binary
conversion) will ever do it.

Stuff like "convert and compress, compress by xamanic enlightenment,
compression by reiki" or anything else.

My understanding is that a proof on the subset of all binary
conversion algorithms
mathematically doesnÂīt transfer to the super set of all
algorithms.Even though
I intuitively know thatÂīs true.

But I understand this is way of topic, so I am off to comp.theory.

Regards and thanks again,

Rafael
The proof above has nothing to do with binary conversion algorithms or
any other algorithms. It has to do with the counts of distinct values.

Reply With Quote
  #23  
Old   
Keith H Duggar
 
Posts: n/a

Default Re: Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ? - 04-09-2010 , 03:21 PM



On Apr 9, 12:57*pm, Rafael Anschau <rafael.ansc... (AT) gmail (DOT) com> wrote:
Quote:
On Apr 9, 12:44*pm, paul c <toledobythe... (AT) oohay (DOT) ac> wrote:

Open up your calculator and look at the result of 2 to the power of 63.
* I get 9,223,372,036,854,775,808. *Only nineteen digits.

Yes, it proves pure binary conversion wonīt do
it(2^64=18446744073709551616)
the 18% Sampo talked about.

But it doesnīt prove(demonstrates truth from a fixed set of given
axioms)
that no other algorithms will ever do it(although it serves as strong
evidence for that).

But thatīs the mathematician in me going off topic, I might debate
that
on alt.math or else.

Thanks anyway,
Let D = the set of 20-digit decimal numbers
Let B = the set of 64-digit binary numbers
Let F = be an encoding function from D to B

By definition for F to be an encoding it must be the case that

ForAll {x, y} . [x .E. D and y .E. D]
F(x) = F(y) implies x = y

therefore F is injective.

from the definition of cardinality :
Quote:
Y| >= |X| ie the cardinality of Y is greater than or equal
to the cardinality of X if-and-only-if there exists an
injective function (G : X -> Y) mapping X onto Y.

(1) Since F is an injective function D -> B then |B| >= |D|.

However, by counting permutations we have that

Quote:
D| = 10^20
B| = 2^64
and 10^20 > 2^64 therefore |D| > |B| which contradicts (1) above
therefore there cannot exist an injective function so F cannot
be injective. Therefore no encoding F from D to B exists.

You might also want to google the "pigeon hole principle" for
some interesting related discussions.

KHD

Reply With Quote
  #24  
Old   
Rafael Anschau
 
Posts: n/a

Default Re: Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ? - 04-09-2010 , 03:41 PM



On Apr 9, 5:21*pm, Keith H Duggar <dug... (AT) alum (DOT) mit.edu> wrote:

Quote:
You might also want to google the "pigeon hole principle" for
some interesting related discussions.
Hi Keith, I just read about it, and I am honored to have an answer
from a MIT student.

The insight I had was this(probably someone has refuted it already,
but itīs still in my mind).

Itīs possible to compress, using from the 256 possibilities of a byte
255 for values and one for concatenation. Assuming the concatenation
byte as 11111111


10101101011110001110101111000101101011000110000111 11111111111111111

becomes

10101101011110001110101111000101101011000110000(11 111111)(meaning
+)10100(20 in binary)*(assuming anything that comes after the byte of
frequency is the repeated digit)1.

Of course, that would require making numbers without the byte of
11111111 which would result in a new way for representing binary
numbers.But in this method, I donīt see any problems. It would also
mean repetition is frequent enough in this universe, which I am not
sure. It would also mean
that the frequency is high enough so all 20 digits numbers could be
placed there. Which seems very unlikely, so I accept it as proved
based on the unlikeliness of it all.

But getting back to earth, I guess databases just use 9 bytes for the
last 82% of the numbers that eventually need to be stored.

Thanks,

Reply With Quote
  #25  
Old   
Rafael Anschau
 
Posts: n/a

Default Re: Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ? - 04-09-2010 , 03:57 PM



On Apr 8, 10:57*am, Sampo Syreeni <de... (AT) iki (DOT) fi> wrote:

Quote:
Unless my trusty old TI-86 is losing precision in embarrassing places,
I should also add that 64 bits doesn't quite suffice for 20 decimal
digits; it has 18 per cent or so of the total range, so you'd have to
go with 19 digits or three extra bits.
Actually what databases do(just checked the manual, something I should
more often before engaging into discussions) is to hold only those
exactly 18%.

Itīs right there at:

http://dev.mysql.com/doc/refman/5.0/...ric-types.html

Reply With Quote
  #26  
Old   
Sampo Syreeni
 
Posts: n/a

Default Re: Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ? - 04-09-2010 , 08:17 PM



On Apr 9, 6:09*pm, Rafael Anschau <rafael.ansc... (AT) gmail (DOT) com> wrote:

Quote:
Any proof of that, or is it just another hypothesis ? My intuition
tells me this is true, but I would like to see a proof of that.
The simplest, most efficient proof would be to count to the maximum
number in each base before the nearmost forbidden digit switches from
0 to 1. Then compare. Do tack each of the individual numbers on paper,
so that you don't forget one of them. Also mail me your notes so that
I can doublecheck. There might be a publication there after all, if
the count doesn't come up right, so you'd be helped by some peer
review if that should happen.

More seriously, there are tons of proofs for this sort of thing. The
simplest fully general ones rely on the fact that the exponentiality
of any pure place notation translates into additivity in place under a
logarithm and behaves monotonously in between in any basis (even a
fractional one). That lets you compute relative amounts of digits and
the relative remainder by the division algorithm, evenwhile you're
really comparing ranges of number systems. Had I done it in that
domain, I could have said you lacked a fractional number of bits; and
in certain cases that would have made practical sense as well (cf.
e.g. arithmetic coding).

No, that's not a proof. It's a high level sketch to one and a pointer
to further math, which is all that you would get even from a real
mathematician. The point being, if you have to ask that sort of thing,
then a) you're avoiding your homework, so we're not going to help you
with that, b) you're heckling, so we're not going to entertain you, or
c) you really need help, in which case a high level hint helps you
help yourself *much* better than outright telling you the answer.

Then, how precisely was your question relevant to database theory? Do
you perhaps think your ongoing counts are going wrong because
predicate logic is somehow flawed, and that might then affect database
practice as well? Maybe you just wanted to probe us folks with an IQ
test to see if we're worthy of your attention. Oh, maybe I've
encountered my first real troll, reeling the wire in by the written
line. How quint.

Whatever the case, stop it and return to the subject of the group or
be killfiled: database theory. Preferably the relational kind.
--
Sampo

Reply With Quote
  #27  
Old   
Doug
 
Posts: n/a

Default Re: Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ? - 04-16-2010 , 02:43 PM



On Apr 9, 1:41*pm, Rafael Anschau <rafael.ansc... (AT) gmail (DOT) com> wrote:
Quote:
On Apr 9, 5:21*pm, Keith H Duggar <dug... (AT) alum (DOT) mit.edu> wrote:

You might also want to google the "pigeon hole principle" for
some interesting related discussions.

Hi Keith, I just read about it, and I am honored to have an answer
from a MIT student.

The insight I had was this(probably someone has refuted it already,
but itīs still in my mind).

Itīs possible to compress, using from the 256 possibilities of a byte
255 for values and one for concatenation. Assuming the concatenation
byte as 11111111

10101101011110001110101111000101101011000110000111 11111111111111111

becomes

10101101011110001110101111000101101011000110000(11 111111)(meaning
+)10100(20 in binary)*(assuming anything that comes after the byte of
frequency is the repeated digit)1.

Of course, that would require making numbers without the byte of
11111111 which would result in a new way for representing binary
numbers.But in this method, I donīt see any problems. It would also
mean repetition is frequent enough in this universe, which I am not
sure. It would also mean
that the frequency is high enough so all 20 digits numbers could be
placed there. Which seems very unlikely, so I accept it as proved
based on the unlikeliness of it all.

But getting back to earth, I guess databases just use 9 bytes for the
last 82% of the numbers that eventually need to be stored.

Thanks,
No possible compression could work if you have N possible input
values, but only M possible output values, with M < N. This is
obvious just by thinking about reversing the compression: with at most
M distinct values in the compressed space, you can have at most M
uncompressed values, meaning that some values from the original N
can't be represented.

Reply With Quote
  #28  
Old   
Rafael Anschau
 
Posts: n/a

Default Re: Whatīs the algorithm that compresses a 20 digit big int, into 8 bytes ? - 04-16-2010 , 03:11 PM



On Apr 16, 4:43*pm, Doug <Doug.McMa... (AT) oracle (DOT) com> wrote:

Quote:
No possible compression could work if you have N possible input
values, but only M possible output values, with M < N. *This is
obvious just by thinking about reversing the compression: with at most
M distinct values in the compressed space, you can have at most M
uncompressed values, meaning that some values from the original N
can't be represented.
True. Thanks Doug. I guess passion was above reason when I wrote that.

Rafael

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.