isPDFFile() and CFPDF won't recognize a PDF

New Here, Oct 01, 2008

I have a PDF that I can open in Adobe Acrobat Reader 8.0 without problems. However, if I use the CFPDF tag with action="read", I get the following error:
"Error: Invalid Document D:\temp\test.pdf specified for source or directory. "

If I call isPDFFile() on the same file, it returns false.

The only workaround I have found is to read the file with <cffile action="readBinary" ...> and serve it with the <cfcontent> tag.

I had this same problem on 8.0 on a version 1.2 document, but installing 8.0.1 fixed it. This is a 1.3 document.

Here is a link to the file
test.pdf

Is this a bug in CF?

LEGEND, Oct 01, 2008

I can confirm the behaviour you're seeing. I think it's a bug. You should
raise it as such with Adobe.

--
Adam

Valorous Hero, Oct 01, 2008

Yes, I get the same results with some sort of exception about the trailer.

com.adobe.internal.pdftoolkit.core.exceptions.PDFCosParseException: Expected 'trailer' : 66066
at com.adobe.internal.pdftoolkit.core.cos.XRefTable.readTrailer(Unknown Source)
at com.adobe.internal.pdftoolkit.core.cos.XRefTable.parseTableXrefChain(Unknown Source)
..

Interestingly, cfpdf can read the file if you copy it with iText first. So maybe it is a bug, or, if the document is malformed, maybe iText is simply more forgiving?

<cfscript>
// Input and output paths (adjust to your environment)
pdfFileIn = "c:\pathToFile\test.pdf";
pdfFileOut = "c:\pathToFile\testCopy.pdf";
// Let iText parse the original file
pdfReader = createObject("java", "com.lowagie.text.pdf.PdfReader").init( pdfFileIn );
streamOut = createObject("java", "java.io.FileOutputStream").init( pdfFileOut );
// PdfStamper writes the copy to the output stream when it is closed
pdfStamper = createObject("java", "com.lowagie.text.pdf.PdfStamper").init( pdfReader, streamOut );
pdfStamper.close();
streamOut.close();
</cfscript>

<cfpdf action="read" source="#pdfFileOut#" name="pdfContent">

New Here, Oct 02, 2008

Adam - Thanks for taking a look. I submitted the bug.

CFSearching - Thanks for taking a look and providing the sample with iText. That made it work for me too. This will help me create a workaround for now. I haven't familiarized myself with iText yet. That's going to change.

BKBK - You may be correct on the 'corrupt as a PDF binary'. However, I had this same problem with a 1.2 versioned document when using 8.0 to display it. After doing the 8.0.1 update, that document looked fine, so it's hard to tell if it's an Acrobat issue or a CF issue. Either way, it's an Adobe issue.
On the counterfeit remark - please don't dig too much into that. This is a court document created by an attorney using bankruptcy software and some word processor. He printed it, scanned it, OCR'd it, and submitted it to our system. The copyright info you see in the margin is for the bankruptcy software he uses.
On the personal information - This document is a matter of public record and can be downloaded on the internet for a fee (.08/page). You may also drive to Birmingham, AL and view it for free in our reception area. Once you file for bankruptcy, your case information becomes public.

Valorous Hero, Oct 02, 2008

quote:

Originally posted by: zbis12
On the counterfeit remark - please don't dig too much into that. This is a court document created by an attorney using bankruptcy software and some word processor, printed it, scanned it, ocr'd it, and submitted it to our system.

The copyright info you see in the margin is for the bankruptcy software he uses.



Not being a lawyer, I cannot comment on copyrights. But I will say that, given the number of steps involved, there certainly seems to be an opportunity for file corruption.

quote:

Originally posted by: zbis12
On the personal information - This document is a matter of public record and can be downloaded on the internet for a fee (.08/page). You may also drive to Birmingham, AL and view it for free in our reception area. Once you file Bankruptcy, your case information becomes public.



I suspect you are 100% correct about it being public record. That said, my personal approach falls in the "just because you can, does not mean you should" category 😉 Even when I am not bound by some sort of NDA, I try not to disclose client-related information. IMO it is just good business sense, for both the client and myself. Plus, it shows some basic courtesy to all parties involved, something that is often in short supply these days.

(My $0.02, for what it is worth 😉)

Community Expert, Oct 02, 2008

Me smell a fish. The document is very likely corrupt. In more ways than one.

First, it is corrupt as a PDF binary. ColdFusion has done you a favour and told you so.

Secondly, it is as corrupt as counterfeit. Two reasons:

1) Pages 2, 3 and 4 contain copyright information in the margin, which page 1 lacks.
2) The font type changes when you go from page 1 to page 2.

IsPDFFile is a blunt tool, and makes no firm promises. The documentation tells you, "This function returns False if the value is not a valid pathname to a PDF file, the pathname is null, the PDF file is not valid, or the PDF file is corrupted.".

The fact that this is a legal document makes the fish even smellier. By the way, if those are really someone's private details, what are they doing here?




Community Expert, Oct 02, 2008

Zbis12,
I understand what you say. However, it doesn't go against my remarks. In fact, you can verify what I said.

There is a change in font. There are also clear signs that the material had undergone image processing and perhaps a merge operation before it finally became a PDF document. All of which increases the chances for the file to be corrupt.

The fact that you can open a PDF with certain software, including Adobe's own Reader, is not a guarantee that the file is error-free. Observe it for yourself.

Open your file, test.pdf, in a text editor. Replace each occurrence of 65775 with 66066. Save the file as test2.pdf. Now, perform the exercise that -==cfSearching==- gave earlier, using test2.pdf instead. You will see that isPDFFile("testCopy.pdf") returns "Yes", even though you've blatantly corrupted the original file.
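
If you want to look at the cross-reference pointer yourself, outside ColdFusion, here is a rough Python sketch. It reads the last `startxref` value in a file and checks whether the bytes at that offset begin with the `xref` keyword. The helper names `check_startxref` and `build_sample` are made up for illustration, the sample bytes are a toy file and not test.pdf, and the check assumes a classic cross-reference table (the kind used before PDF 1.5). It is only a structural spot check, not what ColdFusion or iText do internally, and it makes no claim about what 65775 or 66066 mean in test.pdf.

```python
import re


def check_startxref(data: bytes) -> dict:
    """Report whether the last startxref offset points at an 'xref' keyword."""
    # The final cross-reference pointer sits near the end of the file.
    tail = data[-1024:]
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        return {"offset": None, "points_at_xref": False}
    offset = int(matches[-1])
    # In a classic (pre-1.5) file the literal keyword 'xref' starts at that offset.
    points_at_xref = data[offset:offset + 4] == b"xref"
    return {"offset": offset, "points_at_xref": points_at_xref}


def build_sample(offset_error: int = 0) -> bytes:
    """Build a toy PDF-like byte string; offset_error skews the startxref value."""
    body = b"%PDF-1.3\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    xref_pos = len(body)  # the xref table starts right after the body
    xref = b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
    trailer = b"trailer\n<< /Size 2 /Root 1 0 R >>\n"
    pointer = b"startxref\n%d\n%%%%EOF\n" % (xref_pos + offset_error)
    return body + xref + trailer + pointer


if __name__ == "__main__":
    print(check_startxref(build_sample()))    # pointer left intact
    print(check_startxref(build_sample(5)))   # pointer deliberately skewed
```

To try it on a real file, pass `open(path, "rb").read()` to `check_startxref`. A tolerant reader may still open a file that fails this check, for example by rebuilding the table from the objects it finds.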

I wouldn't submit a bug report on this one. I think it is the document that has a bug, not ColdFusion.

LEGEND, Oct 03, 2008

> I wouldn't submit a bug report on this one. I think it is the document that
> has a bug, not Coldfusion.

I'm split on this one. I can see where you're coming from, and am mostly
inclined to agree.

However I think an equitable benchmark for the function could be "will it
open in Acrobat (reader)?"

It does, so it's reasonable for that test to pass.

A relevant aside here... I presume docs created using a current version of
Acrobat will not necessarily open in older versions of Acrobat, unless some
sort of compatibility mode is used. In this light, shouldn't the function
take an optional "PDF version" argument too? Or is there a minimum
standard that qualifies as "is a valid PDF"?

(I never work with PDFs, so am completely ignorant of this sort of thing).

--
Adam

Community Expert, Oct 03, 2008

Adam Cameron wrote:
A relevant aside here... I presume docs created using a current
version of Acrobat will not necessarily open in older versions of
Acrobat, unless some sort of compatibility mode is used. In this
light, shouldn't the function take an optional "PDF version"
argument too? Or is there a minimum standard that qualifies
as "is a valid PDF"?


Good question. Readers from Adobe should take note, pun intended, naturally.
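
On the version point, a PDF declares its version in a `%PDF-x.y` header at the start of the file, so a caller can at least read that value and decide for itself. A minimal Python sketch follows; the function name `declared_pdf_version` is hypothetical, and this is not how isPDFFile() works internally. From PDF 1.4 onward a `/Version` entry in the document catalog can override the header, which this sketch ignores.

```python
import re
from typing import Optional


def declared_pdf_version(data: bytes) -> Optional[str]:
    """Return the version named in the %PDF-x.y header, or None if absent."""
    # Only inspect the start of the file, where the header is expected.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None


if __name__ == "__main__":
    # Toy input; for a real file use open(path, "rb").read()
    print(declared_pdf_version(b"%PDF-1.3\n1 0 obj\n"))
```

Note that the header only says what the file claims to be. It says nothing about whether the rest of the file is well formed.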
