Any help is greatly appreciated!
Long story short - I can provide more details if necessary - I have a collection of jpg files that are pictures of documents - no images - that I had to take to copy them instead of xeroxing them. I am trying to convert them into pdfs that are:
1.) as black and white as possible so as to make it as close to printing a regular black and white word document as possible (instead of using a lot of printer ink on printing pictures that just happen to be saved in pdf format), and
2.) searchable via OCR.
After a recommendation from a teacher, I purchased Adobe Acrobat a few days ago, which I've never used before. That being said, I can convert them into pdfs, and I can make them searchable because I've played around with the OCR and that seems pretty straightforward.
However, no matter how many times I try, no matter how many different preferences/settings/etc. I play around with (as an aside, how the hell do I revert these settings back to default? I can't remember what they all were beforehand!), I can't get the pdfs to look like anything other than the same jpg image but now saved as a pdf.
I've also been recommended ABBYY FineReader, and I've tried that as well and it seems to do a much better job, unless I'm missing something.
I'd share the documents, but I can't figure out how.
EDIT: After uploading pdfs to acrobat.com, taking 30 mins to get workspaces to open up, but then not being able to transfer the documents from acrobat.com to workspaces, I finally uploaded them again from my desktop to workspaces - WHY IS THIS SO DIFFICULT?!?!
So my question is basically this: How do I get Adobe to make files like this:
look like this:
This has been a very frustrating day. Any help is appreciated.
The URLs return "document not found".
You've mixed a few common terms that imply different meanings:
...pictures of documents - no images - that I had to take to copy them instead of xeroxing them. I am trying to convert them into pdfs that are:
I cannot infer whether you took a picture of or scanned a paper document. Regardless, you have an image that you want to convert to a Word document? OCR was the first step. Given that there are no images, use Save As MS Word Document (it may be an export; Adobe has moved or renamed the feature with recent releases).
Thanks for the response and sorry for the confusion!!!
First, I took pictures of documents, so I have jpegs of documents I want to turn into searchable pdfs that won't waste a lot of ink to print.
Here are the links again (I "published" them so they should work now). Both are from the same jpeg, but were made with different software - one with Adobe acrobat and the other with abbyy finereader.
My question is how do I adjust Acrobat so that my pdfs of these jpegs go from looking like this:
to looking like the pdfs I made on abbyy finereader:
Ok, people tend to use "black and white" to mean "shades of grey". I see now that the picture (photo?) is shades of grey already, but what you are aiming for is a page with all the grey gone, and only black colours on white paper. To avoid confusion, I'll call this "monochrome".
When you scan pages, this is a setting you choose in the scanner. If the page is coming from another source, I don't believe there is any way for you to do this - Acrobat has no such function so far as I know.
You could use other programs to work on your images before going to PDF, and that is probably the best solution. Photoshop can turn to monochrome.
If using a scanner is any kind of option, I'd strongly recommend it. Digital photographs are definitely second best.
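For anyone who wants to see what "turn to monochrome" means in practice, here is a minimal sketch in plain Python. It is not what Photoshop, Acrobat or FineReader do internally (that isn't documented in this thread); it swaps in Otsu's method, a standard way to pick one global cutoff between "ink" and "paper". The function names and the tiny sample page are made up for illustration, and the page is assumed to be already loaded as rows of grey values from 0 (black) to 255 (white).

```python
def otsu_threshold(pixels):
    """Pick the grey level (0-255) that best splits pixels into dark and light."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    dark_count, dark_sum = 0, 0
    for t in range(256):
        dark_count += hist[t]
        dark_sum += t * hist[t]
        light_count = total - dark_count
        if dark_count == 0:
            continue  # nothing at or below this level yet
        if light_count == 0:
            break     # everything is already in the dark class
        mean_dark = dark_sum / dark_count
        mean_light = (sum_all - dark_sum) / light_count
        # Between-class variance: larger means a cleaner split.
        score = dark_count * light_count * (mean_dark - mean_light) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def to_monochrome(rows):
    """Map every pixel to pure black (0) or pure white (255)."""
    t = otsu_threshold([p for row in rows for p in row])
    return [[0 if p <= t else 255 for p in row] for row in rows]


# Hypothetical 2x3 "page": light grey paper with two dark text pixels.
page = [[200, 190, 60],
        [195, 50, 205]]
mono = to_monochrome(page)
```

A single global cutoff like this works best when the page is evenly lit. Photos taken with a camera often are not, so treat this as a starting point only.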
I appreciate the feedback, and was forced to take digital photographs per the facility rules. I just wanted to know if I could take the jpegs and, using acrobat, turn them into pdfs that look like the one I made on ABBYY FineReader - see my previous post for examples of what I've been able to produce with each.
If acrobat cannot make pdfs that look like the one I made on abbyy finereader - and unless someone else has any input it appears that it cannot despite the copious features Acrobat has - I will just return acrobat and get a refund sometime this week. Any other input would be appreciated and thanks for those who have already chimed in.
I tried that but the result isn't what I'm looking for - the "white" of the page is still gray and the contrast isn't as strong as I want it.
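A "white" that stays gray is typical of camera photos where the light falls unevenly across the page. One common way around it is a local (adaptive) threshold: each pixel is compared with the average of its own neighbourhood rather than with one cutoff for the whole page. The sketch below is an assumption-laden illustration in plain Python, not ABBYY's or Adobe's actual method; `adaptive_monochrome`, `radius` and `offset` are names invented here, and the sample page is made up.

```python
def adaptive_monochrome(rows, radius=1, offset=10):
    """Black where a pixel is clearly darker than its local average, else white."""
    height, width = len(rows), len(rows[0])
    out = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Clip the neighbourhood window at the page borders.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [rows[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            out_row.append(0 if rows[y][x] < local_mean - offset else 255)
        out.append(out_row)
    return out


# Hypothetical page lit from the left: paper fades from 220 down to 130.
# The faded text pixel (150) is lighter than the paper on the dim side (130),
# so no single page-wide cutoff can separate text from paper here.
paper = [220, 205, 190, 175, 160, 145, 130]
page = [paper,
        [220, 150, 190, 175, 160, 60, 130],
        paper]
mono = adaptive_monochrome(page)
```

On this sample only the two text pixels come out black and all the paper comes out white, including the dim right-hand side. On real photos `radius` would need to be much larger than 1 (roughly wider than a pen stroke), and `offset` would need tuning per batch.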
I need it to look like this (which was done with the same jpeg files but using abbyy's black and white mode):
The file in the above link is the pdf I created using ABBYY FineReader. A few posts up is the same link, along with a link to a pdf I made on Acrobat from the same jpeg file. I actually have hundreds if not thousands of these pages to print which is why I want them to look black and white like the one from ABBYY, and typing them is not a viable option for obvious reasons.
Here - I just uploaded the jpeg I used for the above examples so that you guys can see everything.
So to sum up:
I have a bunch - hundreds if not thousands - of documents I had to digitize using a webcam (long story but I had no other choice). Here is an example: https://acrobat.com/?d=WwCbIo*kb6oO*azvmGixAw
I have used both Acrobat X Pro/XI Pro to create searchable pdfs of these jpegs and they look something like this: https://acrobat.com/?d=KJskXucbh2pBBPuZmsL6Hw
Trying the various preflight and other features only provides different shades of gray, literally and figuratively, from the above link.
However, when creating searchable pdfs on ABBYY FineReader, using their black and white mode, I get them looking exactly how I want them, like this - https://acrobat.com/?d=RrXBupJrVFQBKNT*DdqK5g
As you can see, it looks A LOT different. It is also easier to read, easier for ocr, and uses a lot less ink when printing, especially by the hundreds.
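To put a rough number on the ink difference, here is a back-of-the-envelope sketch in plain Python. It assumes ink use is roughly proportional to how dark each pixel is, which ignores how a real printer halftones, so read the result as a proxy only. The function name and the sample values are made up for illustration.

```python
def ink_coverage(rows):
    """Average darkness of a page: 0.0 is blank white, 1.0 is solid black."""
    pixels = [p for row in rows for p in row]
    return sum(255 - p for p in pixels) / (255 * len(pixels))


# Hypothetical 2x2 patch: three paper pixels and one text pixel.
grey_photo = [[180, 180],
              [180, 40]]    # grey paper, dark grey text
monochrome = [[255, 255],
              [255, 0]]     # white paper, black text

grey_ink = ink_coverage(grey_photo)
mono_ink = ink_coverage(monochrome)
```

In this toy patch the grey version comes to about 0.43 coverage and the monochrome one to 0.25, even though the text itself got darker. On a real page, where text covers only a small part of the area, the gap would be much wider because the paper stops using ink altogether.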
I want to make the pdfs I make on Acrobat look like the ones on ABBYY FineReader.
Therefore, (using the same links as above), my question is how, using Acrobat XI Pro, do I get this https://acrobat.com/?d=KJskXucbh2pBBPuZmsL6Hw
to look like this https://acrobat.com/?d=RrXBupJrVFQBKNT*DdqK5g
While that is definitely an improvement - and I really appreciate your help with this - I would like to preserve the original appearance of the pages instead of just extracting what the OCR can pick up. Acrobat's OCR does a good job, but unlike the page I'm using for these examples, a lot of the pages have some faded text or even handwriting that the OCR will not pick up and that, in your example, simply do not show up at all.
If you want the initial format, where the page is placed on the glass of the scanner (pages from a book), I'm afraid that is the best you're going to do.
The grey background you see is where light is leaking in because the cover isn't able to seal light-tight. The distortions in the print are caused by the paper not lying 100% flat.
The only way you're going to get a perfect scan is to cut the pages out of the book so they can be laid perfectly flat on the scanner with the cover pressed light-tight.
There are scanners that look something like a three-ring hole punch, and when you get ready to scan you pull the paper through. You might be able to push the paper all the way through, then press scan and pull it back through. That might do a better job.
Anyway, books and magazines will exhibit the qualities you mention on a flatbed scanner.
If you're good at using a graphics editor such as GraphicConverter, you might be able to remove the gray background.
Thanks to everyone who has responded to my questions, but I have decided to return Acrobat Pro and stick with ABBYY FineReader.
Perhaps I'm misunderstanding your responses, but I can't help but feel as if either nobody here is able to see the differences between the two pdfs I link to above, or nobody here believes me when I say that the jpeg file (link above) I used to create the pdf with ABBYY (link above) was the same jpeg file I used to create the pdf with Acrobat (link above).
Instead, the consensus seems to be that the only way I could have possibly created the kind of perfect-looking black and white (or monochrome or whatever) pdf that I made with ABBYY (as opposed to pdfs Acrobat makes, which look and print exactly like the jpeg images but with various grey filters applied to it) was by using a scanner, a perfectly flat document, and maybe even graphics editing software.
None of these are needed because that isn't the issue here. I wasn't allowed to use a scanner anyway as is often the case when digitizing documents from a university's library archives.
The only thing you need to do is not use Acrobat but use ABBYY FineReader 11 instead. FineReader was not only able to convert the jpegs (despite the fact the jpegs were made not with a scanner but with a webcam that was propped above the document) into pdfs that looked like the one above, but did so unbelievably easily - simply by selecting the black and white mode option before the conversion.
Again, I appreciate everyone who took the time to read and respond to my questions, and perhaps in the field of pdf conversion and creation what I'm asking to do isn't often done or in high demand, which may be why Acrobat can't do it. Admittedly even on ABBYY's software this black and white mode is a brand new feature. Nevertheless, given that ABBYY's software can do this so easily, even with imperfectly taken jpeg files, AND produce OCR that is more accurate than Acrobat's, all for less than half of Acrobat's cost, I can't help but question why Adobe can't get their feature-laden, labyrinthine software to do this, especially for 450 bucks.
Also, with all due respect, when someone is asking if Acrobat can do something with already-existing files (that couldn't be made with a scanner) that similar software can do with those same files, responding by suggesting that they throw out the files they already made and have to use, and go back and recreate new files in a way that was already determined to be unavailable, is both unhelpful and a bit frustrating, especially when you come off in your suggestions as seeming to ignore or disbelieve that the other standalone software actually did this in the first place.
Thanks again for everyone's time, and I hope this thread is of some use in the future to anyone trying to do what I'm doing now.