2 Replies Latest reply on Apr 11, 2017 7:47 AM by Karl Heinz Kremer

    Adobe DC save as Text takes more time

    viralp90264498

      Hello,

      I am using Adobe DC, version 2015.006.30033.

       

      I am trying to save the text from a 14-page PDF file, but the export takes around 5 hours.

      During the export, it shows the following message:

      "Page Pass: Page 2 of 14;Evaluate Knowledge Source[2]"..

       

      Why does it take so long for such a small page count? What does the above message mean, and what is Adobe looking at internally as the source?

      Please advise.

       

      Note: we are using Adobe's "Export as Plain Text" option.

       

      Thanks

        • 2. Re: Adobe DC save as Text takes more time
          Karl Heinz Kremer Adobe Community Professional

          The lack of comments is probably due to the fact that this is not a very common problem (otherwise you would very likely see a lot more complaints about this issue).

           

          Here is how I would debug the issue:

           

          Does this happen with all PDF files, with just this one, or with files from the same source? If it only happens with this one file (or similar files created by the same PDF producer), you may want to complain to the author of the PDF file. Chances are they are using a bad PDF generator, or are generating the file in a way that makes it hard or impossible for Acrobat to extract text. You can potentially find out whether it's just one page by splitting the file into, e.g., 14 individual pages and then processing each page separately. If, e.g., 13 pages process in a matter of seconds and one takes hours, you know which one is the culprit.
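
          If you want to record the per-page timings rather than watch the clock, the idea can be sketched in a few lines of Python. This is only an illustration: `process_page` is a hypothetical stand-in for whatever you use to export text from the single-page file for page n (it is not an Acrobat API), and `time_pages` and `slowest_page` are names made up for this sketch.

```python
import time
from typing import Callable, List, Tuple


def time_pages(
    page_count: int,
    process_page: Callable[[int], object],
    clock: Callable[[], float] = time.perf_counter,
) -> List[Tuple[int, float]]:
    """Run process_page(n) for pages 1..page_count and record seconds per page."""
    timings = []
    for page in range(1, page_count + 1):
        start = clock()
        # Hypothetical step: export text from the single-page file for this page.
        process_page(page)
        timings.append((page, clock() - start))
    return timings


def slowest_page(timings: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Return the (page, seconds) pair with the largest elapsed time."""
    return max(timings, key=lambda entry: entry[1])
```

          Calling `slowest_page(time_pages(14, process_page))` would then point at the page worth a closer look. The `clock` parameter exists only so the timing source can be swapped out.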

           

          There is one method that could potentially get you to a result much faster, even though it's a much more complex workflow: You could try to export the PDF file as high-resolution TIFF images (e.g. 600 dpi), and then import these TIFF images into Acrobat again. You can then OCR the resulting document and try to export again. Depending on your export settings, you may not have to run OCR as a separate step; it can be done, if necessary, as part of the export function.

           

          BTW: There is a new update for Acrobat DC out today. You may also want to download that and re-run your test to see if there were any changes to the export function that would change the performance for your test case.