4 Replies Latest reply: Jun 13, 2012 9:56 AM by yerk55

    Batch converting PDFs to image-based?

    yerk55

      I have a batch of multi-page PDF documents that are text-based. I want to re-save them so that they are all image-based instead. The increase in file size is not a concern.

       

      I’m using Acrobat X Pro, but I haven’t been able to figure out how to do this, or whether it’s even possible. Thanks in advance for any tips!

        • 1. Re: Batch converting PDFs to image-based?
          Sandeep V. Employee Hosts

          I am not quite sure what you mean by 'image based'. Do you want to flatten the PDF contents? You can do this using the Flatten Layers option in the Action Wizard. You can also print the files to the Adobe PDF printer to create new PDFs. To do this:

           

          Set the Adobe PDF printer as the default printer.

          In its printing preferences, select the 'High Quality' joboptions. (There will be some loss of quality, but this joboption gives the best possible results.)

          Open Acrobat

          File -> Action Wizard -> Create New Action -> More Tools -> Print. Save this action.

          Now you can execute this action on multiple files.

           

          You can also flatten the PDF using Content -> Flatten Layers, or select More Tools -> Preflight and choose the flattening options in the drop-down.

          See if this helps.

           

          ~Sandeep V.    

          • 2. Re: Batch converting PDFs to image-based?
            yerk55 Community Member

            Ah, unfortunately I had already tried both of those to no avail. By image-based I mean that the document looks as if it had been optically scanned in from paper, with no text recognition. It's not possible to select any individual characters or words in the document, because it's all just one big image of text instead of actual text.

             

            I will explain a bit more about why, in case it helps point to a solution. Basically, several different attorneys send PDF documents that they have created to my office. It's my job to merge these documents with PDFs of our own, and then upload the resulting files to a US court's website. The court is rejecting many of the files because they contain illegal fonts (for example, the attorneys may use Wingdings to create a checkbox), and a file containing such a font will be rejected.

             

            Since I am unable to force all these various attorney offices to edit their documents and stop using "illegal" fonts, I must find a way to "clean" them of these fonts, while still keeping the same look of the document. Until now, we have been physically printing them out to paper, and re-scanning them back in. This makes them image-based, containing no font information whatsoever, and so the court accepts them.

             

            But this is inefficient and wasteful of paper and time. So I was trying to find a way to achieve a similar result with Acrobat, in a batch fashion. I've tried playing around with many different Actions from the Action Wizard, but so far have not been able to find something that will do what I'm seeking here.

            • 3. Re: Batch converting PDFs to image-based?
              Sandeep V. Employee Hosts

              Hi,

               

              That's a tough job you do. I appreciate that you are trying to save paper by doing this electronically.

               

              So, to summarize technically: what you do is convert the PDFs to JPEGs (scanning the printed copy, to whatever format) and then combine these individual JPEGs into a PDF. Right? You may scan to PDF directly, but you still need to combine the pages into one document. Whether you do that with a built-in feature of your scanner or with Acrobat is another matter.

               

              You can do this entire job using Acrobat. Here is my suggestion:

              1). Create an Action to export the PDFs to JPEG/TIFF images. To do this:

              • File -> Action Wizard -> Create New Action -> on the right-hand side, click Options

              http://dl.dropbox.com/u/60137234/AdobeForum/Options.PNG

              • Select Export File(s) to Alternate Format and select TIFF (TIFF gives better results)

              http://dl.dropbox.com/u/60137234/AdobeForum/SelectTIFF.png

              Click Ok.

              In the same window, the 'Start With' drop-down lets you choose whether to run this action on a PDF already open in Acrobat, on files you select from your computer, or on an entire folder on your computer.

              You can select a predefined folder where you want to save these exported images or let Acrobat prompt you every time for this.

              This step does the job of printing the documents and then scanning those printouts to JPEGs/PDFs.

               

              2). Combine these JPEG/TIFF images into a PDF, and you are done.

              Create another action for this:

              • Under Create New Action -> 'Start With' drop-down -> select Combine Files into a single PDF
              • 'Save to' drop-down -> select a particular folder or the same folder where the files are located, then save the action.

               

              Now you just need to run these two actions from Acrobat and they will do the entire job for you. If you want, you can copy these actions from one machine to another. Here is the location on Win7: C:\Users\<UserName>\AppData\Roaming\Adobe\Acrobat\10.0\Sequences

               

              Helping you with this issue is like saving hundreds of sheets of paper every day, and that's the idea behind Acrobat. Feel free to write back for more information, or if you want me to upload these actions here.

               

              ~Sandeep V.

              • 4. Re: Batch converting PDFs to image-based?
                yerk55 Community Member

                Ah it's close, but then I get a directory full of files like:

                 

                DocumentA_page1.jpg

                DocumentA_page2.jpg

                DocumentA_page3.jpg

                 

                DocumentB_page1.jpg

                DocumentB_page2.jpg

                 

                They are all mixed together, so obviously they can't simply all be joined into one PDF. There are dozens of documents weekly, so the joining process would involve a fair amount of manual work.
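
                For what it's worth, sorting the exported pages back into per-document groups could probably be scripted. Below is a minimal Python sketch, assuming the files really do follow the `<document>_page<N>.jpg` naming shown in the listing above (the pattern and the `group_pages` name are illustrative assumptions, not anything Acrobat provides). It only groups and orders the file names; combining each group into a PDF would still be a separate step.

```python
import re
from collections import defaultdict

# Assumed naming pattern, based on the listing above: <document>_page<N>.<ext>
PAGE_PATTERN = re.compile(
    r"^(?P<doc>.+)_page(?P<page>\d+)\.(?:jpe?g|tiff?)$", re.IGNORECASE
)


def group_pages(filenames):
    """Map each document name to its page files, ordered by page number."""
    pages = defaultdict(list)
    for name in filenames:
        match = PAGE_PATTERN.match(name)
        if match is None:
            continue  # ignore files that do not follow the assumed pattern
        pages[match.group("doc")].append((int(match.group("page")), name))
    # Sort numerically so that page10 comes after page9, not after page1.
    return {
        doc: [name for _, name in sorted(items)]
        for doc, items in sorted(pages.items())
    }


if __name__ == "__main__":
    listing = [
        "DocumentA_page1.jpg",
        "DocumentB_page2.jpg",
        "DocumentA_page3.jpg",
        "DocumentB_page1.jpg",
        "DocumentA_page2.jpg",
    ]
    for doc, files in group_pages(listing).items():
        print(doc, files)
```

                Each resulting list could then be handed to whatever does the combining, one document at a time.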

                 

                I learned about the "Print as Image" checkbox in the Print dialogue (Print, Advanced). Checking this box and then printing to the Adobe PDF printer achieves the desired result, but doesn't allow for any kind of batch processing.

                 

                Sometimes I can get this checkbox to stay checked even after closing and re-opening Acrobat. But even so, if multiple files are selected and printed, it will behave as if the box was not checked. If only this checkbox were an option in the Adobe PDF printer's preferences.

                 

                It's not the prettiest idea ever, but I'm contemplating just making a simple script to simulate user input that will, in a loop: open each document, go to Print, Advanced, check the Print As Image checkbox, reprint it and then move on to the next one.
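
                The outer loop of such a script might look roughly like the Python sketch below. This is only a skeleton under stated assumptions: `print_as_image` is a hypothetical placeholder for whatever actually drives Acrobat's Print dialogue (simulated input or otherwise), and the folder names in the usage example are made up. It shows the batching and error collection, not the printing itself.

```python
from pathlib import Path


def batch_reprint(input_dir, output_dir, print_as_image):
    """Call print_as_image(source, target) for every PDF in input_dir.

    print_as_image is a hypothetical callback supplied by the caller; it
    stands in for the step that opens the file and prints it with
    "Print as Image" checked. Returns a list of (source, error) pairs
    for files that failed, so one bad file does not stop the batch.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    failures = []
    for source in sorted(input_dir.iterdir()):
        if not source.is_file() or source.suffix.lower() != ".pdf":
            continue  # skip sub-folders and non-PDF files
        target = output_dir / source.name
        try:
            print_as_image(source, target)
        except Exception as exc:  # record the failure and keep going
            failures.append((source, exc))
    return failures


if __name__ == "__main__":
    # Stub callback that only reports what it would do.
    def dry_run(source, target):
        print(f"would reprint {source} -> {target}")

    if Path("incoming").is_dir():  # hypothetical folder name
        failed = batch_reprint("incoming", "cleaned", dry_run)
        print(f"{len(failed)} file(s) failed")
```

                Keeping the Acrobat-specific part behind a single callback means the looping logic can be tested with a stub before any simulated input is involved.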