This may be putting the cart before the horse, but I'm trying to prepare for an upcoming document dump. I'm expecting around a million pages, and I will need to search the documents for specific keywords. BUT ... as it is a document dump, I fully expect the pages to be scanned as individual pages and not to be searchable. I'm thinking I will need to merge the individual documents and convert them so I can do a word search. Can anyone tell me how to do this? Once again, this is prep work. I don't have the documents yet, but I will be under pressure to search them quickly. Thanks.
I made a million-page PDF once. You REALLY wouldn't want to do that. Nothing would work on it, certainly not OCR. This is inconceivably beyond the limits and intentions of Acrobat, whether as one file or many. It's a big project; expect it to take person-years. A common approach to OCR at this scale is to outsource it to where labour is cheap.
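For the search side, merging is not actually required. Here is a minimal sketch of a keyword search over separate pages, assuming the OCR step (however it gets done, in-house or outsourced) has already produced one plain-text file per page. The function names `build_index` and `search`, the `.txt` extension, and the folder layout are all my own assumptions for illustration, not anything Acrobat provides.

```python
import re
from collections import defaultdict
from pathlib import Path


def build_index(text_dir):
    """Map each lowercase word to the set of page files that contain it.

    Assumes text_dir holds one UTF-8 .txt file per OCR'd page
    (hypothetical layout; adjust to whatever the OCR step produces).
    """
    root = Path(text_dir)
    index = defaultdict(set)
    for path in sorted(root.rglob("*.txt")):
        # OCR output is often messy, so replace undecodable bytes
        # instead of failing on them.
        text = path.read_text(encoding="utf-8", errors="replace")
        page = path.relative_to(root).as_posix()
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            index[word].add(page)
    return index


def search(index, *keywords):
    """Return the sorted page files that contain every keyword."""
    hits = [index.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*hits)) if hits else []
```

You would call something like `index = build_index("ocr_text")` once, then `search(index, "acme", "invoice")` for each query. This only matches exact single words, so OCR misreads and phrases will be missed, and at a million pages an in-memory index like this may need to be replaced by a proper search tool. Treat it as a way to see the shape of the problem rather than a finished solution.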