We currently have a large folder containing a lot of PDF files, and we run an index service on it. A couple of web sites use the index service to search inside the documents for keywords. It works fine with PDFs, Word documents, etc. We recently received PDF files from an external source, and the index service can't read any of their contents. These PDFs store personal information, e.g. telephone numbers, identity numbers, etc. We need to search these PDF files so that we can retrieve the right one for each user.
I then tried converter programs (I'm using C# and Java), but they say that I need to upgrade Adobe. I also tried libraries (DLLs and JARs, e.g. iText) to retrieve the values from the files, but they all give me a message that I need to upgrade Adobe. Next, I downloaded a trial edition, Adobe 9 Extended, and it still gives me the same message. Even if you open the file and search for data, e.g. 1234, Adobe says that it can't find it, although it is there in a field. If you search for 1 2 3 4, it finds it, but the PDF needs to be open to search like that.
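To illustrate the spacing behaviour, here is a minimal Java sketch of the kind of whitespace-tolerant match I would want, assuming the field text could be extracted as a plain string in the first place (which is exactly the part I can't get working). The class name `SpacedSearch`, its methods, and the sample string are hypothetical and only for illustration; they are not part of iText or any Adobe API.

```java
import java.util.regex.Pattern;

// Hypothetical helper: finds a search term even when the stored text
// has whitespace between the characters, e.g. "1 2 3 4" for "1234".
class SpacedSearch {

    // Builds a regex that allows optional whitespace between each character of the term.
    public static Pattern spacedPattern(String term) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            if (i > 0) {
                regex.append("\\s*"); // zero or more whitespace characters between characters
            }
            // Quote each character so regex metacharacters in the term are matched literally.
            regex.append(Pattern.quote(String.valueOf(term.charAt(i))));
        }
        return Pattern.compile(regex.toString());
    }

    // Returns true if the term occurs in the text, ignoring whitespace inside the match.
    public static boolean containsSpaced(String text, String term) {
        return spacedPattern(term).matcher(text).find();
    }

    public static void main(String[] args) {
        // Assumed sample of what an extracted field value might look like.
        String extracted = "Tel: 1 2 3 4";
        System.out.println(containsSpaced(extracted, "1234"));
    }
}
```

This only addresses the matching side. It does nothing about the "upgrade Adobe" message, so it would only help once the field values can be read outside of Adobe.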
Could someone please help? We have hundreds of these files that we can't search. The last resort would be to open them one by one and save each filename in a database to link it to the user, and that could take forever.