We need to check whether users are uploading password-protected PDFs into one of our systems. Since we are somewhat restricted in using existing libraries that could do exactly this, we want to parse the PDF as a data stream and look for an identifier or unique string that lets us draw the right conclusion. Does anyone know what we should be looking for, and whether it would be at a specific position within that stream? If so, is that identical across the various PDF versions from 1.4 up to 2.0, or does it differ from version to version?
Look for an Encrypt dictionary in the File Trailer Dictionary. If you need to distinguish Password ("Standard") encryption from other types, you will need to look inside the Encrypt dictionary. The Encrypt dictionary is not, itself, encrypted. See section 3.5 in the PDF specification, or section 7.6 in the latest (ISO) specification.
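As a rough illustration of that check without a PDF library, here is a minimal Python sketch. It does not parse the trailer properly; it just scans the raw bytes for an /Encrypt key followed by an indirect reference or an inline dictionary, then for /Filter /Standard anywhere in the file. The function name classify_pdf and the returned labels are made up for this example. Since nothing is really parsed, false positives are possible (for example, the same byte sequence sitting inside an uncompressed stream), and an Encrypt dictionary stored inside a compressed object stream would presumably be missed.

```python
import re

# /Encrypt followed by an indirect reference ("12 0 R") or an inline dictionary ("<<").
ENCRYPT_KEY = re.compile(rb"/Encrypt\s*(?:\d+\s+\d+\s+R|<<)")
# /Filter /Standard names the password-based ("Standard") security handler.
STANDARD_FILTER = re.compile(rb"/Filter\s*/Standard\b")


def classify_pdf(data: bytes) -> str:
    """Heuristic only. Returns 'not_pdf', 'unencrypted', 'password' or 'other_encryption'.

    classify_pdf and its labels are hypothetical names for this sketch.
    """
    # The %PDF- header is expected near the start of the file.
    if b"%PDF-" not in data[:1024]:
        return "not_pdf"
    # No /Encrypt key anywhere: treat the file as unencrypted.
    if not ENCRYPT_KEY.search(data):
        return "unencrypted"
    # Assumption: any /Filter /Standard in the file belongs to the Encrypt dictionary.
    if STANDARD_FILTER.search(data):
        return "password"
    return "other_encryption"
```

Typical use would be something like classify_pdf(open(path, "rb").read()). Keep in mind that a file can use the Standard handler with an empty user password, in which case it opens without prompting, so "password" here means "uses the Standard security handler" rather than "will ask for a password".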
From a quick glance at some sample files, I would say that if the file contains the string "/AuthEvent/DocOpen/", it probably has a security policy of some kind applied to it. However, this is not a foolproof way of detecting it. For that you would need to write a more complex PDF parser, or use an existing library.
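If you want to try this string test, a whitespace-tolerant version is probably safer than an exact match, since PDF writers may or may not put spaces between names. The sketch below is only that heuristic, with a hypothetical function name. As far as I understand, /AuthEvent is an optional entry in crypt filter dictionaries, so files encrypted without crypt filters would not contain it at all and this test would miss them.

```python
import re

# Matches "/AuthEvent/DocOpen" as well as "/AuthEvent /DocOpen" (any whitespace between).
AUTH_EVENT = re.compile(rb"/AuthEvent\s*/DocOpen\b")


def has_docopen_auth_event(data: bytes) -> bool:
    """Heuristic: True if the raw bytes contain an /AuthEvent /DocOpen pair.

    has_docopen_auth_event is a hypothetical name; a False result does NOT
    prove that the file is unencrypted.
    """
    return AUTH_EVENT.search(data) is not None
```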