![]() | |
![]() |
| | Thread Tools | Display Modes |
#11
| |||
| |||
|
|
"Steve Howard" <stevedhow... (AT) gmail (DOT) com> wrote in message news:8c80cc88-e48f-4c84-8023-856244e660b6 (AT) r19g2000prm (DOT) googlegroups.com... On Mar 15, 1:36 pm, "Syltrem" <syltremz... (AT) videotron (DOT) ca> wrote: Hello I have a particular set of PDF files that cannot be loaded successfully into an Oracle BLOB column. Do you have an actual error message? No error. It's just that some extra characters are added into the BLOB. For instance, I see that a 0A is added following each 0C (LineFeed added following each FormFeed). So if the PDF contains 50 FormFeeds, then 50 LineFeeds are added, making 50 characters to be truncated from the end of the file (the BLOB length always matches the file's length). And the problem is with INSERTING the file into the BLOB. The extraction from Oracle is not the problem, which I can prove by loading the same PDF with a dot net program, and extracting it successfully with the same extraction procedure. dbms_lob.loadBLOBfrofile does the damage. This is with 10.2.0.4 Yesterday, I downloaded Acrobat Pro trial version. I created a new PDF file with it (from a MS-Word document), loaded it into Oracle, and could not retrieve it in good condition. I have not gone into the details yet but I suspect the same problem. Strangely, I have other PDF coming from different sources, that work without problem. Actually, I was surprised that I could not use a PLSQL program that I used for a couple of years to load other PDFs, to load these new ones. Can't tell what Oracle does not like about these new ones. And yes I see there are many example out there showing how to load and extract PDF with Oracle, but non mentions that some PDF (nor any other file types) are not "supported". Thanks Syltrem |
#12
| ||||||
| ||||||
|
|
"Syltrem" <syltremzulu (AT) videotron (DOT) ca> wrote in message news:qsNfp.19889$Lj7.60 (AT) en-nntp-03 (DOT) dc1.easynews.com... Hello I have a particular set of PDF files that cannot be loaded successfully into an Oracle BLOB column. declare l_blob BLOB; l_bfile BFILE:=BFILENAME('COPY_DB','ORIGINAL_FILE.PDF'); v_dest_offset integer:=1; v_src_offset integer:=1; begin insert into test values(1,EMPTY_BLOB()) returning pdf into l_Blob; dbms_lob.fileopen(l_bfile,dbms_lob.FILE_READONLY); dbms_lob.loadblobfromfile(dest_lob=>l_Blob, src_bfile=>l_bfile, amount=>dbms_lob.getlength(l_bfile), dest_offset=>v_dest_offset, src_offset=>v_src_offset); dbms_lob.fileclose(l_bfile); commit; end; / select UTL_RAW.CAST_TO_VARCHAR2(dbms_lob.substr(PDF,2000) ) from test; A hexadecimal dump of the original imported file, compared to that of the exported file, also shows the difference. Your interpretation seems very unlikely - unless you are using an early version of Oracle that has some (hypothetical) silly bugs that have been eliminated in more recent patches releases. |
|
If this happens to pdf files it would also happen arbitrarily to wav, doc, avi, mov, xls, etc ... and there would probably be lots of people complaining about LOBs in the database not working. |
|
Checks: Is the directory you read from NFS mounted or Samba mounted across platforms ? (Yes, your comment about BLOB/CLOB should make this irrelevant, probably) |
|
Is the character set of the server the same as the characterset of the session that runs your SQL*Plus extract ? (And your comments about BLOB/CLOB aren't relevant here because you're using raw_to_varchar2() to convert a stream of raw bytes into characters. Finally: when you spool the raw_to_varchar2() output, it may be SQL*Plus that is taking every LF and turning it into CRLF. If you want to compare input and output you'll probably have to do it the hard way, use rawtohex() on the substr() result, and compare with an od -x (or similar) of the input file and compare by eye. |
|
Solving the problem - you have a reproducible test case, have you sent it to support ask them what they see ? |
|
Find the smallest file that reproduces the problem - preferably something that fits inside your 2000 byte limit, and sent it to Oracle with your input script, output script, and output file. -- Regards Jonathan Lewis http://jonathanlewis.wordpress.com |
#13
| |||
| |||
|
|
"Jonathan Lewis" <jonat... (AT) jlcomp (DOT) demon.co.uk> wrote in message news:su-dnfjX9eYJ5BnQnZ2dnUVZ8l2dnZ2d (AT) bt (DOT) com... "Syltrem" <syltremz... (AT) videotron (DOT) ca> wrote in message news:qsNfp.19889$Lj7.60 (AT) en-nntp-03 (DOT) dc1.easynews.com... Hello I have a particular set of PDF files that cannot be loaded successfully into an Oracle BLOB column. declare l_blob * *BLOB; l_bfile * BFILE:=BFILENAME('COPY_DB','ORIGINAL_FILE.PDF'); v_dest_offset integer:=1; v_src_offset integer:=1; begin insert into test values(1,EMPTY_BLOB()) returning pdf into l_Blob; dbms_lob.fileopen(l_bfile,dbms_lob.FILE_READONLY); dbms_lob.loadblobfromfile(dest_lob=>l_Blob, src_bfile=>l_bfile, amount=>dbms_lob.getlength(l_bfile), * * * *dest_offset=>v_dest_offset, src_offset=>v_src_offset); dbms_lob.fileclose(l_bfile); commit; end; / select UTL_RAW.CAST_TO_VARCHAR2(dbms_lob.substr(PDF,2000) ) from test; A hexadecimal dump of the original imported file, compared to that of the exported file, also shows the difference. Your interpretation seems very unlikely - unless you are using an early version of Oracle that has some (hypothetical) silly bugs that have been eliminated in more recent patches releases. It's not my interpretation, it's that of the Oracle support person handling my SR. He said the files loaded by loadblobfromfile must be "supported". Was news to me but he said this is so. If this happens to pdf files it would also happen arbitrarily to wav, doc, avi, mov, xls, etc ... and there would probably be lots of people complaining about LOBs in the database not working. I cannot agree more, and tried to expain this to Oracle but failed somehow... Checks: * *Is the directory you read from NFS mounted or Samba mounted across platforms ? *(Yes, your comment about BLOB/CLOB should make this irrelevant, probably) No, the file is local to the server. I did this also with Oracle XE (Win XP) with the PDF on my C drive, with the same results. I'll try that again just to be 200% sure as it makes less and less sense. * *Is the character set of the server the same as the characterset of the session that runs your SQL*Plus extract ? (And your comments about BLOB/CLOB aren't relevant here because you're using raw_to_varchar2() to convert a stream of raw bytes into characters. * *Finally: *when you spool the raw_to_varchar2() output, it may be SQL*Plus that is taking every LF and turning it into CRLF. If you want to compare input and output you'll probably have to do it the hard way, use rawtohex() on the substr() result, and compare with an od -x (or similar) of the input file and compare by eye. Agreed, but it shows the same extra characters in the same spot as I see them in the file I create when the BLOB is extracted with UTL_FILE. But it's not CRLF, it's FFLF or LFFF (can't remember if LF is added in front of FF or after it) And I know it's not my extraction procedure that's wrong as it can successfully extract the PDF it it was originally imported to the BLOB with a dot net program. Unless of course there really is something weird about character set conversion (which should not happend with BLOB). I run the load and extract PLSQL from the same session... so NLS settings can't change. Solving the problem - you have a reproducible test case, have you sent it to support ask them what they see ? They say they can reproduce, and the problem is my PDF which is probably "wrong". ??? Find the smallest file that reproduces the problem - preferably something that fits inside your 2000 byte limit, and sent it to Oracle with your input script, output script, and output file. -- Regards Jonathan Lewis http://jonathanlewis.wordpress.com |
#14
| |||
| |||
|
#15
| |||
| |||
|
|
syltrem: # They say they can reproduce, and the problem is my PDF which is probably "wrong". Why don't you give us a link to somewhere where you put up what you think is a "good version" of your PDF and some other people could check it out. Include in also please a copy of the code you are using to get it in and out ... that way some other people could check it out for you. |
#16
| |||
| |||
|
#17
| |||
| |||
|
|
I'm still in doubt with all Oracle support told me about the files having to be "supported" (or be of an Oracle "known" type). |
#18
| |||
| |||
|
|
In article <XsLip.726032$Bh.128... (AT) en-nntp-12 (DOT) dc1.easynews.com>, syltremz... (AT) videotron (DOT) ca says... I'm still in doubt with all Oracle support told me about the files having to be "supported" (or be of an Oracle "known" type). I think that is utter rubbish. It's just bits. -- jeremy |
#19
| |||
| |||
|
|
On Mar 24, 2:06*pm, Jeremy <jeremy0... (AT) gmail (DOT) com> wrote: In article <XsLip.726032$Bh.128... (AT) en-nntp-12 (DOT) dc1.easynews.com>, syltremz... (AT) videotron (DOT) ca says... I'm still in doubt with all Oracle support told me about the files having to be "supported" (or be of an Oracle "known" type). I think that is utter rubbish. It's just bits. Yes, bits is bits except when you take a binary file and add CR/LF pairs when it shouldn't. IIRC, there is/was a switch somewhere that caused SAMBA to leave the file alone an not try to "translate" it - it behaves as if the file is transferred to the SAMBA share on VMS in ASCII mode which adds the CR/LF because it is not smart enough to know that it is a binary vs a text file. |
#20
| |||
| |||
|
|
In article <XsLip.726032$Bh.128... (AT) en-nntp-12 (DOT) dc1.easynews.com>, syltremz... (AT) videotron (DOT) ca says... I'm still in doubt with all Oracle support told me about the files having to be "supported" (or be of an Oracle "known" type). I think that is utter rubbish. It's just bits. -- jeremy Yes, bits is bits except when you take a binary file and add CR/LF pairs when it shouldn't. IIRC, there is/was a switch somewhere that caused SAMBA to leave the file alone an not try to "translate" it - it behaves as if the file is transferred to the SAMBA share on VMS in ASCII mode which adds the CR/LF because it is not smart enough to know that it is a binary vs a text file. |
![]() |
| Thread Tools | |
| Display Modes | |
| |