Copy link to clipboard
Copied
I have a bunch of historical documents that were scanned as bitmaps, and saved to a PDF. The text is fuzzy, and the client has asked me to clean them up. If this was grayscale, I'd be able to, but does anyone have a great workaround for cleaning up Bitmap images? I've tried Levels, Curves, Auto Contrast, Sharpening...which would all work to some degree if the document were anything BUT bitmap. What am I missing?!
Thanks, geniuses.
Blurred first then used Unsharp Mask. Better but far from good.
Copy link to clipboard
Copied
You need to convert them to grayscale to apply any (Smart) Filters (in this case maybe Dust&Scratches), then you should (in my opinion) save those layered Files and save bitmap copies off them.
But as always: Garbage in, garbage out.
Copy link to clipboard
Copied
Thanks. I forgot to mention that I had converted them to grayscale to work with them. It's just that with bitmap, it's already either black or white...so nothing is really changing that much!
Copy link to clipboard
Copied
So you need to use a meaningful Filter like Dust&Scratches.
so nothing is really changing that much!
For the Adjustments you mentioned probably »not al all«, don’t forget to view the results at View > 100%.
Copy link to clipboard
Copied
Might be better to consider re-scanning using OCR, you will then have proper searchable, useable, readable digital text.
Copy link to clipboard
Copied
No argument there, but as the OP mentioned
historical documents that were scanned as bitmaps
I suspect the actual pages might not be at their disposal at all.
Copy link to clipboard
Copied
I wish that were an option! These are historic documents (some from the 1800s). I'm not even sure the originals exist anymore!
Copy link to clipboard
Copied
If it's worth it, it might be worth experimenting to see if this text can be digitised into live text.
Copy link to clipboard
Copied
Great idea, but these are legal, historical documents, so they really can't be changed in any meaningful way (just cleaned up, etc.).
Copy link to clipboard
Copied
The best combination – if you have the resources – is to have the rasterised image so users can see the original image design and OCR text that's readable and searchable.
Copy link to clipboard
Copied
Blurred first then used Unsharp Mask. Better but far from good.
Copy link to clipboard
Copied
This is the best I've found as well. Yours is a bit better than mine with that method. Do you remember what your settings were?
Copy link to clipboard
Copied
As I mentioned earlier: Garbage in, garbage out.
Copy link to clipboard
Copied
I think Gaussian blur was around 6.5, then about 500% and 14 pixel radius for sharpening. Make the layer a smart object first, then you can tweak them as needed.
Copy link to clipboard
Copied
Thank you so much, edgrimley. This will definitely work.
Copy link to clipboard
Copied
There is similar topic in Photoshop Scripting section. That won't give you what you want, but I see it is somehow related as was made to clean old documents. Last version of script is not posted as a whole but in 3 parts, so do not think that correct solution have everything there should be. If later someone wants to have a script as a whole let me know so I post it in one piece: How to remove small black dots from text page - selecting pixel radius via script