2 Replies Latest reply on Feb 26, 2015 5:18 AM by mrwilhale

    CF9 Solr Hangs on Corrupt PDFs

    mrwilhale

      I am indexing 34,000+ documents stored on the server's local hard drive.

      Windows Server 2008 SP2

      CF9

      Oracle

       

      Thanks to advice in another thread I started, I am indexing the folders one at a time, with an update after each. Some of the PDFs are huge (130 MB), but the average is closer to 1 MB. Occasionally I hit a PDF that is corrupt (if I copy it to my desktop and try to open it, Acrobat Pro reports it as corrupt).

       

      I have tried using cfpdf to read the header info inside a cftry block, with the catch writing a log entry. That should work, but the request hangs while trying to read the document (I assume the same thing is happening with Solr). I get no log entry, and the request stays hung until it times out.

       

      Can anyone think of a way to break out of a hung file and continue to index the remaining files?
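
      To make the question concrete, the behaviour being asked for is a per-file time limit: run each file on a worker thread, wait a bounded time, and move on if it has not finished. Below is a minimal sketch of that pattern in plain Java (ColdFusion runs on the JVM), not ColdFusion or Solr code. The names `TimeBoxedIndexer`, `FileTask`, and `processAll` are hypothetical, and the lambda syntax assumes Java 8 or later, which is newer than the JVM CF9 shipped with. It also assumes the stuck read responds to thread interruption; if it does not, the worker thread lingers in the background, but the loop still continues.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeBoxedIndexer {

    /** Hypothetical stand-in for "read the PDF header" or "index this file". */
    public interface FileTask {
        void process(String path) throws Exception;
    }

    /**
     * Runs the task once per path with a per-file time limit.
     * Returns the paths that timed out or threw, so they can be logged and reviewed.
     */
    public static List<String> processAll(List<String> paths, FileTask task, long timeoutMillis) {
        List<String> skipped = new ArrayList<>();
        for (String path : paths) {
            // A fresh single-thread executor per file, so a stuck worker cannot block later files.
            ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "file-probe");
                t.setDaemon(true); // a stuck worker must not keep the JVM alive
                return t;
            });
            Future<?> result = worker.submit(() -> {
                task.process(path);
                return null;
            });
            try {
                // Wait at most timeoutMillis for this file.
                result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                result.cancel(true); // requests interruption; does not forcibly kill the thread
                skipped.add(path);
            } catch (ExecutionException e) {
                skipped.add(path);   // the task itself threw, e.g. on an unreadable file
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                skipped.add(path);
                break;               // the calling thread was interrupted; stop the loop
            } finally {
                worker.shutdownNow();
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Simulated workload: any path containing "corrupt" hangs for a minute.
        List<String> paths = Arrays.asList("a.pdf", "corrupt.pdf", "b.pdf");
        List<String> skipped = processAll(paths, p -> {
            if (p.contains("corrupt")) {
                Thread.sleep(60_000);
            }
        }, 200);
        System.out.println("Skipped: " + skipped);
    }
}
```

      The list of skipped paths plays the role of the log entry that never gets written in the cftry approach, because the timeout is enforced by the waiting thread and not by the code that is stuck.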

       

      Thanks