I'm having the same error on CF7: PDFs created with Acrobat PDFMaker 9.0 for Word are not indexed and produce the "Stream error (-140) - SKIPPING" error.
Xamax, did you find a solution to this problem? Or can anyone else help?
I have figured it out....
Adobe Acrobat v8 saves PDFs in v5 (1.4) format by default. Adobe Acrobat v9 saves PDFs in v7 (1.6) format by default.
ColdFusion's CFINDEX doesn't like PDFs saved in v6 (1.5) or higher. So if you upgrade to Adobe Acrobat v9 and want to use the PDFs in a CF Verity search, you need to configure Distiller to save in v5 (or earlier) format so they can be indexed. (Or just resave the files, optimised for v5, after they have been created.)
My solution was to configure the application to do a CFFILE READ on the PDF when a user uploads the document into the system. The first 8 characters of the file contents indicate the PDF version. v4 and v5 will work. v6+ won't work.
<cfif Mid(fileContents, 6, 3) lte 1.4>
<!--- This is PRE version 6 (PDF 1.4 or earlier) - it will be fine in the search --->
<cfelseif Mid(fileContents, 6, 3) gt 1.4>
<!--- This is version 6 or higher (PDF 1.5+) - it won't work in the search - ABORT --->
</cfif>
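For anyone who wants the whole check in one place, here is a rough sketch of the upload-time test described above. The file path and the variable names pdfPath, pdfVersion and canIndex are made up for illustration, and the extra "%PDF-" header test is my own addition rather than part of the original fix.

```cfml
<!--- Hypothetical upload path; replace with the real location of the uploaded PDF --->
<cfset pdfPath = ExpandPath("./uploads/example.pdf")>

<!--- Read the file as text; only the header at the very start matters here --->
<cffile action="read" file="#pdfPath#" variable="fileContents">

<!--- A PDF starts with a header such as "%PDF-1.4"; characters 6-8 hold the version --->
<cfif Left(fileContents, 5) eq "%PDF-" and IsNumeric(Mid(fileContents, 6, 3))>
    <cfset pdfVersion = Val(Mid(fileContents, 6, 3))>
<cfelse>
    <!--- Not a recognisable PDF header; -1 marks the version as unknown --->
    <cfset pdfVersion = -1>
</cfif>

<cfif pdfVersion gt 0 and pdfVersion lte 1.4>
    <!--- PDF 1.4 or earlier: should be safe to add to the Verity collection --->
    <cfset canIndex = true>
<cfelse>
    <!--- PDF 1.5 or later, or unknown: reject the upload or ask for a re-save --->
    <cfset canIndex = false>
</cfif>
```

Note that this reads the entire file into memory just to look at the first 8 characters, which may be worth rethinking for very large PDFs.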
Hope this comes in handy for someone one day!
I have the same issue. It's hard to understand how Adobe would undermine the ability to use Verity Search when we and so many others use CFINDEX and CFSEARCH so heavily over our intranet document repositories.
Does anyone know if this is remedied if CF is upgraded to CF 9? I can't fathom having to entirely retrofit a tried and proven method of document management in our enterprise.
As far as I can tell, Verity ceased to be a product a few years ago, when Autonomy bought Verity (the company). I would not expect there to be any changes to how Verity works in CF. You should port off Verity as soon as you can. CF9 adds the Solr search engine - which runs atop Lucene - as a replacement for Verity. Verity is still in there, but I think you can consider it well and truly deprecated.
This is not based on any official info from Adobe or Autonomy, just based on my experience with Verity, and trying to get support for it over the last few years. I've given up on it.
Thanks for the feedback, it is appreciated.
I would have thought Adobe would advise, and it would have been nice if they had, that CFINDEX and CFSEARCH were deprecated for PDF files created and saved with newer versions of Acrobat.
We have huge investments in our document library, and retrofitting it all to another search engine is no small task.
Adobe really does drop the ball on customer satisfaction and advisories involving gobbled-up company product lines like Allaire/Macromedia ColdFusion.
Yes, Adobe seem to have the position of "well Verity ain't our product... what can we do? [shrug]", which is somewhat lacking, IMO.
That said, it might be an idea to test a straight port from using Verity collections to using Solr collections and see if there actually are any issues, before starting to wail or gnash your teeth too much. Some of the search syntax will be different between the two engines, but other than that you might find there aren't that many problems switching over.
There's no point getting upset about a potential problem when you don't know the problem will actually present itself.
Do you know if Solr will index PDF documents created with Adobe LiveCycle Designer ES 8.2, which in turn generates PDFs for Acrobat 7.0 and later? Or will Solr have trouble with this version of PDF as well?
We don't want to start jumping through hoops and going to Solr when we may wind up with the same issue.
I have no idea.
Did you read the Solr docs?
I can't see it being very hard for you to knock together a quick test, either..?
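A quick test along those lines might look something like the sketch below. It assumes a CF9 server with the Solr service running, and that the engine attribute on CFCOLLECTION behaves as described in the CF9 docs. The collection name, both paths and the search term are all invented for the example, so swap in your own.

```cfml
<!--- Hypothetical names and paths; adjust for your own test server --->
<cfset collectionName = "pdf_solr_test">
<cfset collectionPath = "C:\ColdFusion9\collections">
<cfset samplePdfDir = "C:\test\sample_pdfs">

<!--- Create a Solr (not Verity) collection; this errors if the collection already exists --->
<cfcollection action="create" collection="#collectionName#" path="#collectionPath#" engine="solr">

<!--- Index a folder holding a few of the PDFs in question --->
<cfindex action="update" collection="#collectionName#" type="path" key="#samplePdfDir#" extensions=".pdf" recurse="no">

<!--- Search for a word you know appears in one of those PDFs --->
<cfsearch name="results" collection="#collectionName#" criteria="invoice">

<cfoutput>#results.recordCount# matching document(s)</cfoutput>
```

If the record count comes back as zero for a term you know is in the documents, that would suggest Solr is not extracting text from those PDFs either.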