I have several OCRed PDF files and now I need to remove the OCR information from them. I know that the information is stored in containers inside the PDF. There are <snap> containers and <artifact> containers. The <snap> containers hold the OCR information, and the <artifact> containers hold the image information.
I know that some people have used the PDF-TIFF-PDF workaround, but my files are really large, so that option is too time-consuming.
Does anybody have an idea what a script for this should look like?
I would be very happy if somebody has a solution, or at least knows of a way to get the desired result.
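To make the question more concrete, here is a minimal Python sketch of the kind of thing I imagine. It rests on assumptions I have not verified for my files: that the page content stream has already been decompressed into raw bytes, that the OCR layer is drawn as invisible text objects (`BT ... ET` blocks that set text render mode 3 with `3 Tr`), and that no string inside a text object contains a standalone `ET` token. I do not know how the <snap> containers map onto these operators. The function name `strip_invisible_text` is just a placeholder, and a real script would still need a PDF library to read and write the streams.

```python
import re

# One PDF text object: a BT operator through the next ET operator.
# Naive on purpose: it does not parse string literals, so a standalone
# "ET" inside a string would end the match early (assumption: none occur).
TEXT_OBJECT = re.compile(rb"(?:(?<=\s)|^)BT\s.*?\sET(?=\s|$)", re.DOTALL)

# The "3 Tr" operator selects text render mode 3 (neither fill nor stroke),
# which is how invisible text is drawn.
INVISIBLE_MODE = re.compile(rb"(?<!\S)3\s+Tr(?!\S)")


def strip_invisible_text(content: bytes) -> bytes:
    """Drop text objects that switch to invisible render mode.

    `content` is assumed to be an already-decompressed page content stream.
    Text objects without "3 Tr" and all non-text operators (for example
    the image drawing commands) are left untouched.
    """
    def drop_if_invisible(match: re.Match) -> bytes:
        block = match.group(0)
        return b"" if INVISIBLE_MODE.search(block) else block

    return TEXT_OBJECT.sub(drop_if_invisible, content)


# Hypothetical sample stream: one image, one invisible text object,
# and one visible text object.
sample = (
    b"q\n/Im0 Do\nQ\n"
    b"BT\n3 Tr\n/F1 12 Tf\n(hidden ocr text) Tj\nET\n"
    b"BT\n/F1 12 Tf\n(visible) Tj\nET\n"
)
cleaned = strip_invisible_text(sample)
```

If the render mode is set outside the text object, or the OCR text is marked in some other way, this filter would miss it, so it is only a starting point.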
Thanks in advance,