dbTalk Databases Forums  

Entropy and Quantity of Information

comp.databases.theory comp.databases.theory


Discuss Entropy and Quantity of Information in the comp.databases.theory forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
David Cressey
 
Posts: n/a

Default Entropy and Quantity of Information - 01-11-2008 , 12:41 PM






Instead of hijacking another topic, I'll start this topic.

David BL suggested that a way to quantify information is the number of bits
needed to encode, allowing for compression. I said I preferred entropy as
the measure of information, and suggested that the two measures might in
some way be equivalent. Someone else recalled the concept of entropy from
the study of statistical mechanics and thermodynamics in physics.

The use of the word "entropy" definitiely comes from physics. The formulas
that Shannon (qv) determined for the amount of information in a message bear
an uncanny resemblance to the formulas for entropy in physics. I personally
don't believe that this is mere coincidence.

Next, I talked a little about entropy being context sensitive. There is a
much more precise way than mine of stating this, but I've forgotten what it
is. I think it goes something like this: the self information in a
message is context insensitive but the mutual information between a pair of
messages is dependent on both of them. My use of the word "database" in
place of message in my earlier comment is a little sloppy in this regard.

Next, the use of logarithms, and the number of bits needed to store
(represent?) a message are closely related. If we have 8 equally likely
messages, it will take precisely 3 bits to store one message. Note that
log base 2 of 8 is 3. If you use the formulas for entropy, and use 2 as
the base for the logarithms, the unit of information comes out to be
precisely the bit.

There is a form of compression called Huffman encoding on an ensemble of
messages that takes advantage of mathematical expectation to compress, on
the average, an ensemble of messages when the eight possible messages are
not a priori equally likely. If for example, message number 1 is 50%
likely, you and use a message consisting of 1 bit to represent it, and use
somewhat longer messages for the other 7 possible messages. I don't want to
belabor the point in cdt, but there are ample references for Huffamn
encoding on the web. If you study Huffman coding in detail, you'll start
to get logarithms into the picture.

All of this goes back to the 1960s, and some of it to the 1940s. Is entropy
still widely used in information science? Is it relevant to database
theory?






Reply With Quote
  #2  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-11-2008 , 01:40 PM






On Jan 11, 10:41*am, "David Cressey" <cresse... (AT) verizon (DOT) net> wrote:
Quote:
*Is entropy
still widely used in information science? *Is it relevant to database
theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf



Reply With Quote
  #3  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-11-2008 , 01:40 PM



On Jan 11, 10:41*am, "David Cressey" <cresse... (AT) verizon (DOT) net> wrote:
Quote:
*Is entropy
still widely used in information science? *Is it relevant to database
theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf



Reply With Quote
  #4  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-11-2008 , 01:40 PM



On Jan 11, 10:41*am, "David Cressey" <cresse... (AT) verizon (DOT) net> wrote:
Quote:
*Is entropy
still widely used in information science? *Is it relevant to database
theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf



Reply With Quote
  #5  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-11-2008 , 01:40 PM



On Jan 11, 10:41*am, "David Cressey" <cresse... (AT) verizon (DOT) net> wrote:
Quote:
*Is entropy
still widely used in information science? *Is it relevant to database
theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf



Reply With Quote
  #6  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-11-2008 , 01:40 PM



On Jan 11, 10:41*am, "David Cressey" <cresse... (AT) verizon (DOT) net> wrote:
Quote:
*Is entropy
still widely used in information science? *Is it relevant to database
theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf



Reply With Quote
  #7  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-11-2008 , 01:40 PM



On Jan 11, 10:41*am, "David Cressey" <cresse... (AT) verizon (DOT) net> wrote:
Quote:
*Is entropy
still widely used in information science? *Is it relevant to database
theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf



Reply With Quote
  #8  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-11-2008 , 01:40 PM



On Jan 11, 10:41*am, "David Cressey" <cresse... (AT) verizon (DOT) net> wrote:
Quote:
*Is entropy
still widely used in information science? *Is it relevant to database
theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf



Reply With Quote
  #9  
Old   
Tegiri Nenashi
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-11-2008 , 01:40 PM



On Jan 11, 10:41*am, "David Cressey" <cresse... (AT) verizon (DOT) net> wrote:
Quote:
*Is entropy
still widely used in information science? *Is it relevant to database
theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf



Reply With Quote
  #10  
Old   
David Cressey
 
Posts: n/a

Default Re: Entropy and Quantity of Information - 01-12-2008 , 03:17 AM




"Joe Thurbon" <usenet (AT) thurbon (DOT) com> wrote

[snip good stuff]

Quote:
I vaguely recall that some of the more theoretical machine learning
results (like learnability and optimality results) rely on a notion
entropy.


I once wrote a little program to play Mastermind (a code guessing game)
using entropy to measure a move's worth. Mastermind can be played using a
brute force tree search. But my program tried out only about 10 moves, and
picked the one with the best entropy. Its play was only slightly inferior
to the play of a brute force tree searcher.





Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.