Generally speaking, this problem is the symptom of an improperly created PDF file in which some type of unusual font and font encoding is used. One way to check this is to copy and paste text from one of these PDF files into a Word document. If you get the same “gibberish” in the Word document, then the issue is in fact the PDF file's encoding and there really isn't anything that Adobe can do about that.
(It may be Bank of America, but it is amazing how many PDF files coming out of “enterprise” systems have somewhat unusual and/or defective encoding of their content.)
Please let us know what you find from that experiment. (We understand that you might be fairly reluctant to post the file for us to examine!)
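For anyone who wants to make the copy-and-paste experiment above a little less subjective, here is a minimal Python sketch. It assumes the statement is plain English text, so almost every visible character should be printable ASCII; the function name `unexpected_char_ratio` is made up for illustration, and what counts as "too high" a ratio is a judgment call, not an Adobe rule.

```python
def unexpected_char_ratio(text: str) -> float:
    """Share of visible characters that are not printable ASCII.

    Assumption: the source document is English, so a high share of
    characters such as stray umlauts or control codes hints at an
    encoding problem in the pasted text.
    """
    # Ignore spaces, tabs and line breaks; they say nothing about encoding.
    visible = [c for c in text if not c.isspace()]
    if not visible:
        return 0.0
    # Printable, non-space ASCII runs from "!" (33) to "~" (126).
    unexpected = sum(1 for c in visible if not (32 < ord(c) < 127))
    return unexpected / len(visible)


if __name__ == "__main__":
    pasted = "Beginning balance on 01/01/18  1,234.56"
    print(f"{unexpected_char_ratio(pasted):.1%} unexpected characters")
```

Paste a few lines from the PDF into the `pasted` string. A clean statement should come out at or near zero, while wrongly encoded text will usually score much higher.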
Save the PDF as TIFF files, combine the TIFF files in Acrobat, and export the combined PDF to Excel.
This data will cut and paste into Word perfectly.
I've downloaded a trial of Nitro Pro 11 and it works like a champ.
So, your competition can do it, but again, the answer is, it's a bad PDF and Adobe can't solve the problem? From one of the largest banks in the country? Isn't there a process to get a sample from the source, review it and figure this out?
Ok, that worked. It's a functional, but tedious and ugly, work-around. For those who don't know how, here are the steps.
Create TIFF format file(s). I ended up with 10 TIFF files per statement.
1. Open PDF file in Adobe DC
2. File/Save As/Choose Save location/Set "Save as type" to TIFF
You will likely get more than one TIFF file. Be sure to keep track of them, as it's important in the next step. If you have multiple PDFs that you need to process through this work-around, then I recommend that you clean up (delete the TIFF files) after each PDF is processed.
Now you have to "combine" these files into a new PDF. You can open them all at once by selecting multiple files and then combining from within Adobe, or you can select the files in their location without opening them and then "combine".
2. Choose combine files
3. Select method (already opened files or from their location)
4. Select files, hit Open button
5. You should get a screen showing your selected TIFF images/files, hit Combine in upper left of Adobe screen
6. Green check marks will appear as they are processed, and you will get a new PDF with the name "Binder#"
Now export to Excel
1. Tools/Export/Excel to get the data into the spreadsheet
a. Tested with all data imported into one worksheet
b. Tested with data imported into separate worksheets per page
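The bookkeeping around the TIFF files in the steps above (tracking which pages belong to which statement, then deleting them afterwards) can be sketched in Python. This is only an illustration: it assumes the exported TIFF files sit in one folder and that their names start with the PDF's base name, which you should check against what Acrobat actually wrote on your machine. The function names `tiff_pages` and `delete_tiff_pages` are hypothetical.

```python
import re
from pathlib import Path
from typing import List


def tiff_pages(folder: Path, pdf_stem: str) -> List[Path]:
    """TIFF files in `folder` whose names start with `pdf_stem`, in page order.

    Assumption: the exported pages are named after the PDF. Note that a
    stem like "stmt" would also match files from "stmt2", so use a stem
    that is unique within the folder.
    """
    def natural_key(path: Path):
        # Split digits from text so that page 10 sorts after page 2.
        return [int(part) if part.isdigit() else part.lower()
                for part in re.split(r"(\d+)", path.name)]

    matches = [p for p in folder.iterdir()
               if p.is_file()
               and p.suffix.lower() in (".tif", ".tiff")
               and p.name.startswith(pdf_stem)]
    return sorted(matches, key=natural_key)


def delete_tiff_pages(folder: Path, pdf_stem: str) -> int:
    """Delete the TIFF pages for one statement and return how many were removed."""
    pages = tiff_pages(folder, pdf_stem)
    for page in pages:
        page.unlink()
    return len(pages)
```

Calling `tiff_pages(...)` before combining shows the files in the order they should be selected; calling `delete_tiff_pages(...)` after the Excel export handles the clean-up between statements. Deletion is permanent, so try it on a copy of the folder first.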
Thank you for the workaround. Now, how about a fix?
I can't change your PDF file.
Bernd, thank you for your support...it was helpful and very much appreciated. I did not mean to direct my frustration to you, didn't realize you weren't on Adobe staff.
Most of the people you find here on the Adobe forums are not Adobe employees, but users of Adobe's different applications. As you can see when you scroll up, Adobe employees are identified with a "Staff" badge.
The fact that you can copy and paste into Word and get the correct content does point to a problem in Acrobat. You may want to file a bug report using the Feature Request/Bug Report Form. That is the only reliable way to notify Adobe of a problem in their products.
To be very clear, I did not say that your particular files were bad but that in the past we saw this type of symptom with some bad PDF files.
Thanks for testing the copy-and-paste text. That would indicate that in fact you have found a real issue. What we need to be able to do is analyze the issue and fix it. However, without a sample file, that is difficult.
If you could contact me with a private message and a pointer to such a sample, I'll gladly pursue this within Adobe and we will maintain the confidentiality of the sample file.
While some time has passed since the original post, it is getting to be tax time...
I too use Bank of America and have for a number of years extracted column data from monthly statements. This year (2017/2018) checking account statements appear different from prior years, as neither Save As nor Export To accessible text was working. After spending about 8 hours on a Saturday, I have found another solution that works.
Typically, I create a single Binder1.pdf from 13-14 monthly statements using Acrobat DC File-> Combine Files into a Single PDF... Shift, Select them in order, Save As Binder1.pdf.
The new wrinkle is you need to run Binder1.pdf through Acrobat's Accessibility checker. I ran Full Check, which allows for fixing some items with a single right click. Unfortunately, I had to "fix" several items, including Primary Language, so I'm not sure if there was a single "silver bullet". After that I was able to Edit->Select All, Ctrl-C, then Ctrl-V into my fav text editor, TextPad. This time, no strange umlauts or other non-printing characters.
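Once the text pastes cleanly, pulling the columns out can be scripted. The sketch below is an assumption-heavy illustration, not Bank of America's actual layout: it supposes each transaction line reads "MM/DD/YY description amount" and skips every line that does not fit. The names `LINE` and `parse_rows` are invented for this example, and the pattern will need adjusting to match your statements.

```python
import re
from decimal import Decimal

# Hypothetical line layout: date, free-text description, amount at the end.
LINE = re.compile(r"^(\d{2}/\d{2}/\d{2})\s+(.+?)\s+(-?[\d,]+\.\d{2})$")


def parse_rows(text):
    """Return (date, description, amount) tuples for lines that match LINE."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # headers, page footers, blank lines
        date, description, amount = match.groups()
        # Drop thousands separators before converting to an exact decimal.
        rows.append((date, description, Decimal(amount.replace(",", ""))))
    return rows


if __name__ == "__main__":
    pasted = "01/15/18 GROCERY STORE -1,234.56\nPage 2 of 10\n02/01/18 PAYROLL 2,000.00"
    for row in parse_rows(pasted):
        print(row)
```

`Decimal` is used instead of `float` so that currency amounts keep their exact cents when they are summed or written out to a spreadsheet.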