#101
On Wed, 27 Apr 2011 13:34:31 +0100, Tim Watts wrote: Peter H. Coffin wrote: On Wed, 27 Apr 2011 08:24:16 +0100, The Natural Philosopher wrote: Norman Peelman wrote: Doug Miller wrote: In article <ip57sl$rc5$1 (AT) dont-email (DOT) me>, Norman Peelman

460 * 5.3kb

You wrote above "460 images ... with an average size of 5393kb". 5393kb is 5.4 MEGAbytes, not 5.3kb.

Yes, my fingers were going faster than my brain. Average of 5393 bytes (5kb) Max of 10240 bytes (10kb) ...these are small images. Dump file (w/images) = 29.8MB Dump file zipped = 1.7MB

It is seldom possible to compress images more than they are already compressed. So I still think you have made a mistake.

Dump files can (intentionally and with malice aforethought) export binary columns in hexadecimal text, which is rather compressible. It's also very safe from things like people fussing with it with text editors, being copied and pasted into emails for demonstrative purposes, and other kinds of mistreatment.

I think the point that is being overlooked is that the text, whilst compressible, is itself a re-encoding of an already highly compressed bit of data.

Some parts of the file are highly-compressed data. Well, fairly-highly-compressed, anyway. The actual compression used in, for example, JFIF/.jpeg is Huffman. More of the initial 'compression' in those comes from not actual compression but rather lossy tricks to make an image that looks about the same as the original to the eye, but isn't itself 'compression' in the sense that the data inside itself isn't necessarily further incompressible. What this means is that fairly small images don't have a lot of "compressed data" in them in the first place, and graphics with small fields of image data might easily be half overhead.

If anything, the intermediate text encoding should make things worse overall, not better.

One would think so at first glance, but text is really easy to compress.

We are still talking about 2.2MB compressed image data mixed up with other stuff in an exploded ASCII form being, somehow, recompressed down to a total which is less than the sum of the original images alone.

That's the key to what I think is happening here. See, one image may be compressible for some small gains. But many images, especially with very similar information in the overhead portions of the formats, like they're mostly all the same sizes, or use similar color palettes, end up being compressible by being able to compress duplicate information *between* the images as well as within the image itself.

It would make me want to double check the dumps to see they really had everything...

Always a worthwhile step. But if the dump restores okay, the size alone isn't necessarily a warning that something else is wrong.
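
For anyone who wants to poke at the cross-image effect themselves, here is a rough experiment. It is only a sketch under stated assumptions: the 460 "images" are synthetic, each built from the same 2 KB of made-up overhead plus 3 KB of random payload standing in for already-compressed pixel data, and `fake_bytes` is a helper invented for the demo. Real JPEG or PNG files will share far less than this, so treat the numbers as illustrative only.

```python
import binascii
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable


def fake_bytes(n: int) -> bytes:
    """Return n pseudo-random bytes (a stand-in for incompressible data)."""
    return rng.getrandbits(8 * n).to_bytes(n, "big")


# Assumption: every image carries identical overhead (headers, tables,
# palette) followed by a unique payload that cannot be compressed further.
shared_overhead = fake_bytes(2048)
images = [shared_overhead + fake_bytes(3072) for _ in range(460)]

raw_total = sum(len(img) for img in images)

# Each image compressed on its own: nothing to gain, the bytes look random.
separate_total = sum(len(zlib.compress(img)) for img in images)

# The dump writes every blob as hexadecimal text, one per line,
# which doubles the size on disk.
hex_dump = b"\n".join(binascii.hexlify(img) for img in images)

# One compressor pass over the whole dump can refer back to overhead
# it has already seen in the previous image.
zipped_dump = len(zlib.compress(hex_dump))

print(f"raw images, summed:      {raw_total:>9} bytes")
print(f"each image zipped alone: {separate_total:>9} bytes")
print(f"hex dump:                {len(hex_dump):>9} bytes")
print(f"hex dump, zipped:        {zipped_dump:>9} bytes")
```

On this synthetic data, zipping each image separately gains nothing, while the zipped hex dump comes out smaller than the raw images combined, because the repeated overhead from the previous image still sits inside the compressor's 32 KB window.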
#102
"Robert Crandal" <rcranz143101 (AT) gmail (DOT) com> wrote:

What is image metadata?

The extra information that's neither in the file contents nor the (original) file name:

- which user uploaded that image
- and when
- what caption to use for the image
- does it belong to
  * a series of images?
  * an article?
  * or what?
- what's the MIME type
- what are the dimensions (to write proper <img ...> tags later)

etc. pp.

Remember: if you upload a file to a web server, it gets stored under the user account that runs the web server software (normally there is no such physical user). It's stored in some temporary directory and may even have its file name changed.
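
To make the list above concrete, here is a minimal sketch of one metadata record per image. All names (`ImageMetadata`, its fields, `img_tag`) are invented for illustration and are not taken from any particular framework; in practice this would be a row in a database table rather than an in-memory object.

```python
from dataclasses import dataclass
from datetime import datetime
from html import escape
from typing import Optional


@dataclass
class ImageMetadata:
    """Information that lives next to the image, not inside the file."""
    uploaded_by: str
    uploaded_at: datetime
    caption: str
    mime_type: str
    width: int
    height: int
    original_filename: str            # the server may rename the stored file
    belongs_to: Optional[str] = None  # hypothetical tag, e.g. "article:7"

    def img_tag(self, url: str) -> str:
        """Build an <img> tag from the stored dimensions and caption."""
        return (
            f'<img src="{escape(url, quote=True)}" '
            f'width="{self.width}" height="{self.height}" '
            f'alt="{escape(self.caption, quote=True)}">'
        )


meta = ImageMetadata(
    uploaded_by="rcrandal",
    uploaded_at=datetime(2011, 4, 27, 13, 34),
    caption="Harbour at dusk",
    mime_type="image/png",
    width=640,
    height=480,
    original_filename="IMG_0042.PNG",
)
print(meta.img_tag("/foo/17/23/1234561723.png"))
```

Keeping the original file name as a field matters because, as noted above, the uploaded file may be stored under a different name.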

Especially for images: you might want to create thumbnails for them. Or maybe scaled versions (for mobile phone use?). Then there will be multiple files belonging together (and sharing some metadata).

More to come? Yes! Most web caches (including your browser) will not cache responses for URLs of the sort /foo/image.php?nr=12345. They will however cache responses for /foo/17/23/1234561723.png.
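
One way such a cache-friendly path could be produced is sketched below. The post does not say how the 17/23 directories relate to the file name; this assumes they are simply the last four digits of a numeric image id, split in two. The function name `image_path` and its parameters are hypothetical.

```python
def image_path(image_id: int, ext: str = "png", prefix: str = "/foo") -> str:
    """Map a numeric image id to a static-looking URL path.

    Assumption: the two directory levels are the last four digits of the
    id, which spreads files over at most 100 x 100 directories.
    """
    if image_id < 0:
        raise ValueError("image_id must be non-negative")
    digits = f"{image_id:04d}"  # pad so short ids still give two levels
    return f"{prefix}/{digits[-4:-2]}/{digits[-2:]}/{image_id}.{ext}"


print(image_path(1234561723))  # /foo/17/23/1234561723.png
print(image_path(42))          # /foo/00/42/42.png
```

Because the result has no query string, it looks like an ordinary static file to browsers and proxies, and the two directory levels keep any single directory from filling up with every upload.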