![]() | |
#1
| ||||||||||
| ||||||||||
|
|
| 'Jack Sprat could eat not fat and his wife could eat no lean'); |
|
| dbms_lob.getlength(uncompressed)); |
|
| dbms_lob.getlength(compressed1)); |
|
| dbms_lob.getlength(compressed2)); |
|
| dbms_lob.getlength(uncompressed1)); |
|
| dbms_lob.getlength(uncompressed2)); |
|
| dbms_lob.getlength(uncompressed1)); |
|
| 'Compressed BLOB with Java Decompress (' | SQLCODE || '): ' || SQLERRM); END; |
|
| dbms_lob.getlength(uncompressed2)); |
|
| 'Compressed BLOB with UTL Uncompress (' | SQLCODE || '): ' || SQLERRM); END; |
#2
| |||
| |||
|
|
In fact, on a large text document of say 3Mb, I found that UTL Compress would compress it to roughly half the size, whilst I would get a ratio of around 8:1 with the Java Deflater class. |

#3
| |||
| |||
|
|
Hello, Does anybody know a reason why UTL_COMPRESS would compress to an almost 2:1 ratio? I'm looking into ways to compress LOBS in either 9i or 10g, the purpose being to calculate compression rates for different sets of text documents as a research project. I was trying the UTL_COMPRESS package on 10g as a method of Limpel-Ziv compression and found that the compression rates that it offered seemed quite poor. I compared this to another compression technique, that of using a Deflater class in Java and found this much much better, regardless of the quality parameters I supplied. An implementation of the Java Compression functions that I used can be found in a posting by Piotr Jarmuz in this AskTom article: http://asktom.oracle.com/pls/ask/f?p...12980645053487 I used the following code to compare: DECLARE rawtext raw(32767); uncompressed blob; compressed1 blob; compressed2 blob; uncompressed1 blob; uncompressed2 blob; BEGIN -- Set up BLOB rawtext := utl_raw.cast_to_raw( 'The quick brown fox jumps over the lazy dog, ' || 'Jack Sprat could eat not fat and his wife could eat no lean'); dbms_lob.createtemporary(uncompressed, TRUE); dbms_lob.open(uncompressed, dbms_lob.lob_readwrite); dbms_lob.writeappend(uncompressed, utl_raw.length(rawtext), rawtext); dbms_output.put_line('Uncompressed Length: ' || dbms_lob.getlength(uncompressed)); compressed1 := compressor.blob_compress(uncompressed); dbms_output.put_line('Java Compressed Length: ' || dbms_lob.getlength(compressed1)); compressed2 := utl_compress.lz_compress(uncompressed, 1); dbms_output.put_line('UTL Compressed Length: ' || dbms_lob.getlength(compressed2)); uncompressed1 := compressor.blob_decompress(compressed1); dbms_output.put_line('Java Decompressed Length: ' || dbms_lob.getlength(uncompressed1)); uncompressed2 := utl_compress.lz_uncompress(compressed2); dbms_output.put_line('UTL Uncompressed Length: ' || dbms_lob.getlength(uncompressed2)); dbms_lob.freetemporary(uncompressed1); dbms_lob.freetemporary(uncompressed2); BEGIN uncompressed1 := compressor.blob_decompress(compressed2); dbms_output.put_line('UTL Decompressed by Java Length: ' || dbms_lob.getlength(uncompressed1)); dbms_lob.freetemporary(uncompressed2); EXCEPTION WHEN OTHERS THEN dbms_output.put_line('Error encountered decompressing UTL ' || 'Compressed BLOB with Java Decompress (' || SQLCODE || '): ' || SQLERRM); END; BEGIN uncompressed2 := utl_compress.lz_uncompress(compressed1); dbms_output.put_line('Java Uncompressed by UTL Length: ' || dbms_lob.getlength(uncompressed2)); dbms_lob.freetemporary(uncompressed1); EXCEPTION WHEN OTHERS THEN dbms_output.put_line('Error encountered decompressing Java ' || 'Compressed BLOB with UTL Uncompress (' || SQLCODE || '): ' || SQLERRM); END; dbms_lob.freetemporary(compressed1); dbms_lob.freetemporary(compressed2); dbms_lob.close(uncompressed); dbms_lob.freetemporary(uncompressed); END; The output that I got was: Uncompressed Length: 104 Java Compressed Length: 88 UTL Compressed Length: 100 Java Decompressed Length: 104 UTL Uncompressed Length: 104 Error encountered decompressing UTL Compressed BLOB with Java Decompress (-29532): ORA-29532: Java call terminated by uncaught Java exception: java.util.zip.ZipException: unknown compression method Error encountered decompressing Java Compressed BLOB with UTL Uncompress (-29294): ORA-29294: A data error occurred during compression or uncompression. PL/SQL procedure successfully completed. That was a small piece of text, just to provide an example, but as you can see the Java compression technique is quite a lot better than that of the UTL Compress method, as I said, I've tried different quality parameters to try to improve it. Also, neither method is able to decompress something output by the other. In fact, on a large text document of say 3Mb, I found that UTL Compress would compress it to roughly half the size, whilst I would get a ratio of around 8:1 with the Java Deflater class. Both techniques say that they implement Limpel-Ziv as in the zlib library, and both are apparently compatible with gzip. I tried a further test, outputing the compressed BLOBS to files and trying to gunzip them (with -n option), but to no avail. The UTL Compressed file wasn't recognized at all and the Java one seemed to be missing information. Now, this is probably due to missing or incompatible headers, as zlib headers are not the same as gzip headers. I also gziped the original uncompressed text, and it came out very similar in size to that of the file compressed by Java. So, let's repeat my question! Does anybody know a reason why UTL_COMPRESS would compress to an almost 2:1 ratio? One might explain it by saying that gzip compresses better than, for instance, 'compress', the function provided by the zlib library, but for a comparison of techniques, the following page shows that it doesn't explain the large difference I'm seeing here: http://www.cs.rit.edu/~std3246/thesis/node36.html |
#4
| |||
| |||
|
|
utl_compress.lz_compress(uncompressed, 1); Had you not used "1" the default level would be 6. |
#5
| |||
| |||
|
|
Hello, Does anybody know a reason why UTL_COMPRESS would compress to an almost 2:1 ratio? |
#6
| |||
| |||
|
|
Kevin Crosbie wrote: Hello, Does anybody know a reason why UTL_COMPRESS would compress to an almost 2:1 ratio? Has anybody got a 10g install that they can at least verify this for me, given the code I posted (with a minor alteration to removing the quality of 1 in lz_compress!)? I would really appreciate it! Thanks, Kevin |
#7
| |||||||||||
| |||||||||||
|
|
Glad to do so if you send the data to be loaded and compressed. |
jarmuz (AT) poczta (DOT) onet.pl">Piotr Jarmuz</a>|
| 'Jack Sprat could eat not fat and his wife could eat no lean'); |
|
| dbms_lob.getlength(uncompressed)); |
|
| dbms_lob.getlength(compressed1)); |
|
| dbms_lob.getlength(compressed2)); |
|
| dbms_lob.getlength(uncompressed1)); |
|
| dbms_lob.getlength(uncompressed2)); |
|
| dbms_lob.getlength(uncompressed1)); |
|
| 'Compressed BLOB with Java Decompress (' | SQLCODE || '): ' || SQLERRM); END; |
|
| dbms_lob.getlength(uncompressed2)); |
|
| 'Compressed BLOB with UTL Uncompress (' | SQLCODE || '): ' || SQLERRM); END; |
#8
| |||
| |||
|
|
DA Morgan wrote: Glad to do so if you send the data to be loaded and compressed. Thanks Daniel, Below is a java package, stolen from Tom Kyte's website, some pl/sql code that should automatically test what I'm asking and finally the results I expect to see: I would like it tested outside of my environment to ensure that it's not a misconfiguration on my part. Java Code: create or replace and compile java source named LobCompressor as import java.lang.*; import oracle.sql.*; import java.io.*; import java.util.zip.InflaterInputStream; import java.util.zip.DeflaterOutputStream; /** * A simple class for LOB compression and decompression in Oracle Database. Will work in 8i and better. * * @author <a href="mailto jarmuz (AT) poczta (DOT) onet.pl">Piotr Jarmuz</a*/ public class LobCompressor { /** * Compresses the CLOB into BLOB * * @param clob the source CLOB (plain text) * @param blob the target BLOB (will hold compressed binary data) it should be an empty BLOB retrieved for example with dbms_lob.createtemporary(l_blob,true); * @throws Exception mostly I/O exception if ever */ public static void compress(CLOB clob, BLOB blob) throws Exception { InputStream in=clob.getAsciiStream(); DeflaterOutputStream z=new DeflaterOutputStream(blob.getBinaryOutputStream()) ; byte[] buffer=new byte[clob.getBufferSize()]; int cnt; while ((cnt=in.read(buffer))!=-1) { z.write(buffer,0,cnt); } in.close(); z.close(); } /** * Decompresses the BLOB into CLOB * * @param blob the source BLOB (compressed binary data) * @param clob the target CLOB (will hold plain text) it should be an empty CLOB retrieved for example with dbms_lob.createtemporary(l_clob,true); * @throws Exception mostly I/O exception if ever */ public static void decompress(BLOB blob, CLOB clob) throws Exception { OutputStream out=clob.getAsciiOutputStream(); InflaterInputStream z=new InflaterInputStream(blob.getBinaryStream()); byte[] buffer=new byte[blob.getBufferSize()]; int cnt; while ((cnt=z.read(buffer))!=-1) { out.write(buffer,0,cnt); } z.close(); out.close(); } /** * Compresses the BLOB into BLOB * * @param slob the source BLOB (plain binary data) * @param blob the target BLOB (will hold compressed binary data) it should be an empty BLOB retrieved for example with dbms_lob.createtemporary(l_blob,true); * @throws Exception mostly I/O exception if ever */ public static void compress(BLOB slob, BLOB blob) throws Exception { InputStream in=slob.getBinaryStream(); DeflaterOutputStream z=new DeflaterOutputStream(blob.getBinaryOutputStream()) ; byte[] buffer=new byte[slob.getBufferSize()]; int cnt; while ((cnt=in.read(buffer))!=-1) { z.write(buffer,0,cnt); } in.close(); z.close(); } /** * Decompresses the BLOB into CLOB * * @param blob the source BLOB (compressed binary data) * @param slob the target CLOB (will hold plain binary data) it should be an empty CLOB retrieved for example with dbms_lob.createtemporary(l_blob,true); * @throws Exception mostly I/O exception if ever */ public static void decompress(BLOB blob, BLOB slob) throws Exception { OutputStream out=slob.getBinaryOutputStream(); InflaterInputStream z=new InflaterInputStream(blob.getBinaryStream()); byte[] buffer=new byte[blob.getBufferSize()]; int cnt; while ((cnt=z.read(buffer))!=-1) { out.write(buffer,0,cnt); } z.close(); out.close(); } }; / PL/SQL package specification: create or replace package compressor is function clob_compress(p_clob clob) return blob; function clob_decompress(p_blob blob) return clob; function blob_compress(p_blob blob) return blob; function blob_decompress(p_blob blob) return blob; end; / And PL/SQL package implementation: create or replace package body compressor is procedure clob_decompress(p_blob blob, p_clob clob) as language java name 'LobCompressor.decompress(oracle.sql.BLOB, oracle.sql.CLOB)'; procedure clob_compress(p_clob clob, p_blob blob) as language java name 'LobCompressor.compress(oracle.sql.CLOB, oracle.sql.BLOB)'; procedure blob_decompress(p_slob blob, p_blob blob) as language java name 'LobCompressor.decompress(oracle.sql.BLOB, oracle.sql.BLOB)'; procedure blob_compress(p_slob blob, p_blob blob) as language java name 'LobCompressor.compress(oracle.sql.BLOB, oracle.sql.BLOB)'; function clob_compress(p_clob clob) return blob is l_blob blob; begin if p_clob is null then return null; end if; dbms_lob.createtemporary(l_blob,true); clob_compress(p_clob,l_blob); return l_blob; end; function clob_decompress(p_blob blob) return clob is l_clob clob; begin if p_blob is null then return null; end if; dbms_lob.createtemporary(l_clob,true); clob_decompress(p_blob,l_clob); return l_clob; end; function blob_compress(p_blob blob) return blob is l_blob blob; begin if p_blob is null then return null; end if; dbms_lob.createtemporary(l_blob,true); blob_compress(p_blob,l_blob); return l_blob; end; function blob_decompress(p_blob blob) return blob is l_blob blob; begin if p_blob is null then return null; end if; dbms_lob.createtemporary(l_blob,true); blob_decompress(p_blob,l_blob); return l_blob; end; end; / Pl/SQL Code: DECLARE rawtext raw(32767); uncompressed blob; compressed1 blob; compressed2 blob; uncompressed1 blob; uncompressed2 blob; BEGIN -- Set up BLOB rawtext := utl_raw.cast_to_raw( 'The quick brown fox jumps over the lazy dog, ' || 'Jack Sprat could eat not fat and his wife could eat no lean'); dbms_lob.createtemporary(uncompressed, TRUE); dbms_lob.open(uncompressed, dbms_lob.lob_readwrite); dbms_lob.writeappend(uncompressed, utl_raw.length(rawtext), rawtext); dbms_output.put_line('Uncompressed Length: ' || dbms_lob.getlength(uncompressed)); compressed1 := compressor.blob_compress(uncompressed); dbms_output.put_line('Java Compressed Length: ' || dbms_lob.getlength(compressed1)); compressed2 := utl_compress.lz_compress(uncompressed, 1); dbms_output.put_line('UTL Compressed Length: ' || dbms_lob.getlength(compressed2)); uncompressed1 := compressor.blob_decompress(compressed1); dbms_output.put_line('Java Decompressed Length: ' || dbms_lob.getlength(uncompressed1)); uncompressed2 := utl_compress.lz_uncompress(compressed2); dbms_output.put_line('UTL Uncompressed Length: ' || dbms_lob.getlength(uncompressed2)); dbms_lob.freetemporary(uncompressed1); dbms_lob.freetemporary(uncompressed2); BEGIN uncompressed1 := compressor.blob_decompress(compressed2); dbms_output.put_line('UTL Decompressed by Java Length: ' || dbms_lob.getlength(uncompressed1)); dbms_lob.freetemporary(uncompressed2); EXCEPTION WHEN OTHERS THEN dbms_output.put_line('Error encountered decompressing UTL ' || 'Compressed BLOB with Java Decompress (' || SQLCODE || '): ' || SQLERRM); END; BEGIN uncompressed2 := utl_compress.lz_uncompress(compressed1); dbms_output.put_line('Java Uncompressed by UTL Length: ' || dbms_lob.getlength(uncompressed2)); dbms_lob.freetemporary(uncompressed1); EXCEPTION WHEN OTHERS THEN dbms_output.put_line('Error encountered decompressing Java ' || 'Compressed BLOB with UTL Uncompress (' || SQLCODE || '): ' || SQLERRM); END; dbms_lob.freetemporary(compressed1); dbms_lob.freetemporary(compressed2); dbms_lob.close(uncompressed); dbms_lob.freetemporary(uncompressed); END; / Expected Output: Uncompressed Length: 104 Java Compressed Length: 90 UTL Compressed Length: 102 Java Decompressed Length: 104 UTL Uncompressed Length: 104 Error encountered decompressing UTL Compressed BLOB with Java Decompress (-29532): ORA-29532: Java call terminated by uncaught Java exception: java.util.zip.ZipException: unknown compression method Error encountered decompressing Java Compressed BLOB with UTL Uncompress (-29294): ORA-29294: A data error occurred during compression or uncompression. |
#9
| ||||||||||
| ||||||||||
|
|
Thanks. But where is the file you want compressed? |

|
| dbms_lob.getlength(uncompressed)); |
|
| dbms_lob.getlength(compressed1)); |
|
| dbms_lob.getlength(compressed2)); |
|
| dbms_lob.getlength(uncompressed1)); |
|
| dbms_lob.getlength(uncompressed2)); |
|
| dbms_lob.getlength(uncompressed1)); |
|
| 'Compressed BLOB with Java Decompress (' | SQLCODE || '): ' || SQLERRM); END; |
|
| dbms_lob.getlength(uncompressed2)); |
|
| 'Compressed BLOB with UTL Uncompress (' | SQLCODE || '): ' || SQLERRM); END; |
#10
| |||
| |||
|
|
DA Morgan wrote: Thanks. But where is the file you want compressed? Well, actually I just wanted to see how the raw text I had set up in the script compressed, but a large text file is a good idea too as we won't be led astray by compression header information etc. I downloaded the following file into my /tmp folder: wget http://www.gutenberg.org/files/18637/18637-8.txt It should be quite compressible as it's not got much real information. ![]() If you set up a user directory as (Given proper permissions etc.): create or replace directory OS_TMP_DIR as '/tmp'; Then run: set serveroutput on DECLARE rawtext raw(32767); uncompressed blob; compressed1 blob; compressed2 blob; uncompressed1 blob; uncompressed2 blob; v_text_file bfile := bfilename('OS_TMP_DIR', '18637-8.txt'); v_length number; v_dest_offset number := 1; v_src_offset number := 1; BEGIN dbms_lob.createtemporary(uncompressed, TRUE); dbms_lob.open(uncompressed, dbms_lob.lob_readwrite); v_length := dbms_lob.getlength(v_text_file); dbms_lob.fileopen ( file_loc => v_text_file, open_mode => dbms_lob.file_readonly); dbms_lob.loadblobfromfile ( dest_lob => uncompressed, src_bfile => v_text_file, amount => v_length, dest_offset => v_dest_offset, src_offset => v_src_offset); dbms_lob.close (file_loc => v_text_file); dbms_output.put_line('Uncompressed Length: ' || dbms_lob.getlength(uncompressed)); compressed1 := compressor.blob_compress(uncompressed); dbms_output.put_line('Java Compressed Length: ' || dbms_lob.getlength(compressed1)); compressed2 := utl_compress.lz_compress(uncompressed); dbms_output.put_line('UTL Compressed Length: ' || dbms_lob.getlength(compressed2)); uncompressed1 := compressor.blob_decompress(compressed1); dbms_output.put_line('Java Decompressed Length: ' || dbms_lob.getlength(uncompressed1)); uncompressed2 := utl_compress.lz_uncompress(compressed2); dbms_output.put_line('UTL Uncompressed Length: ' || dbms_lob.getlength(uncompressed2)); dbms_lob.freetemporary(uncompressed1); dbms_lob.freetemporary(uncompressed2); BEGIN uncompressed1 := compressor.blob_decompress(compressed2); dbms_output.put_line('UTL Decompressed by Java Length: ' || dbms_lob.getlength(uncompressed1)); dbms_lob.freetemporary(uncompressed2); EXCEPTION WHEN OTHERS THEN dbms_output.put_line('Error encountered decompressing UTL ' || 'Compressed BLOB with Java Decompress (' || SQLCODE || '): ' || SQLERRM); END; BEGIN uncompressed2 := utl_compress.lz_uncompress(compressed1); dbms_output.put_line('Java Uncompressed by UTL Length: ' || dbms_lob.getlength(uncompressed2)); dbms_lob.freetemporary(uncompressed1); EXCEPTION WHEN OTHERS THEN dbms_output.put_line('Error encountered decompressing Java ' || 'Compressed BLOB with UTL Uncompress (' || SQLCODE || '): ' || SQLERRM); END; dbms_lob.freetemporary(compressed1); dbms_lob.freetemporary(compressed2); dbms_lob.close(uncompressed); dbms_lob.freetemporary(uncompressed); END; / I got the following output: Uncompressed Length: 3803470 Java Compressed Length: 1307235 UTL Compressed Length: 2266379 Java Decompressed Length: 3803470 UTL Uncompressed Length: 3803470 Error encountered decompressing UTL Compressed BLOB with Java Decompress (-29532): ORA-29532: Java call terminated by uncaught Java exception: java.util.zip.ZipException: unknown compression method Error encountered decompressing Java Compressed BLOB with UTL Uncompress (-29294): ORA-29294: A data error occurred during compression or uncompression. That's 2.9:1 for Java Deflate and 1.68:1 for UTL_COMPRESS using the default quality of 6. Setting the quality to 9 for UTL_COMPRESS the output was: UTL Compressed Length: 2265869 for UTL_COMPRESS, that's still 1.68:1 |
![]() |
| Thread Tools | |
| Display Modes | |
| |