colinodden

Logging redactions

Apr 6, 2012 9:21 AM

Tags: #redaction #highlighting #logging

Dear You,

 

I'm operating on a large corpus of documents where human coders are looking at OCR-ed PDFs. We're trying to facilitate their coding work by highlighting relevant search terms. Redaction actually works pretty well for that, as highlighting with an empty red rectangle draws attention to stuff likely to be of interest.

 

Let's say, though, that some documents match 5 words in the document, some match 30, some match whatever. It would help our project immensely to know how many matches there are in a given document. We also want this to happen automagically -- counting by hand is beside the point.

 

Logging would be the holy grail ... even if there's a redaction log with a bunch of noise in it, we're willing to parse the log to get what we want. Yet, we can't find any evidence that Acrobat (Windows or Mac, v9, but we're willing to shell out for X if it gives us this functionality) logs much of anything it does to a document.

 

Many thanks in advance.

 

Colin Odden

Ohio State University

 
Replies
  • George Johnson
    Apr 6, 2012 9:30 AM   in reply to colinodden

    Are you saying that you're using the Search & Redact feature? It's possible with a script to count how many redaction annotations are present. It's also possible with a script to search through a document for a word and automatically add text highlights, and then count how many there are.
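The second approach mentioned above (search for words, add highlights, count them) might look roughly like the sketch below. It relies on the Acrobat JavaScript Doc members `numPages`, `getPageNumWords`, `getPageNthWord`, `getPageNthWordQuads` and `addAnnot`. The function name `highlightAndCount` and the `terms` list are made up for illustration, and the matching is a simple case-insensitive whole-word comparison, which may not behave the same way as Search & Redact.

```javascript
// Sketch only: highlightAndCount and its parameters are illustrative names,
// not part of Acrobat. `doc` is expected to be an Acrobat Doc object.
function highlightAndCount(doc, terms) {
    // Build a lookup table of lower-cased search terms.
    var wanted = {};
    for (var t = 0; t < terms.length; t += 1) {
        wanted[String(terms[t]).toLowerCase()] = true;
    }

    var count = 0;
    for (var p = 0; p < doc.numPages; p += 1) {
        var numWords = doc.getPageNumWords(p);
        for (var w = 0; w < numWords; w += 1) {
            // With bStrip = true, punctuation and whitespace are stripped.
            var word = doc.getPageNthWord(p, w, true);
            if (word && wanted[word.toLowerCase()] === true) {
                // Add a text highlight over the matched word.
                doc.addAnnot({
                    page: p,
                    type: "Highlight",
                    quads: doc.getPageNthWordQuads(p, w)
                });
                count += 1;
            }
        }
    }
    return count;
}
```

From the console or a batch sequence, something like `highlightAndCount(this, ["budget", "contract"])` would return the number of highlights added, assuming `this` is the open document.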

     
  • George Johnson
    Apr 6, 2012 11:39 AM   in reply to colinodden

    Is all you're looking for a count of the number of redaction annotations that were added? I'm assuming that the redactions aren't actually applied by the batch process. Is that correct?

     
  • George Johnson
    Apr 6, 2012 12:54 PM   in reply to colinodden

    Here's a simple script that will count the number of redaction annotations in a document and show the total in an alert popup.

     

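    // Make sure all annotations have been scanned before counting
    this.syncAnnotScan();
    // getAnnots() returns null when the document has no annotations
    var annots = this.getAnnots();
    var sum = 0;

    if (annots) {
        for (var i = 0; i < annots.length; i += 1) {
            if (annots[i].type === "Redact") {
                sum += 1;
            }
        }
    }

    app.alert("Total redaction annotations: " + sum);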
    

     

     

    This just demonstrates that you can determine the number of redaction annotations with a script. You can adapt it to suit your needs. For example, you could use it in a batch process and write the number for each file to the JavaScript console by changing the last line to:

     

    console.println(this.documentFileName + ": " + sum);

     

    When you open the console (Ctrl+J on Windows, Cmd+J on Mac) after the batch process, it will show one line per file with the file name and the number of redaction annotations.
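Since the original question mentions being willing to parse a log, the console text can be post-processed once it has been copied out of the console. Here is a minimal sketch, assuming each useful line has the `name: count` shape produced by the `console.println` line above. `parseCountLines` is a hypothetical helper name, and any line that doesn't fit the pattern is treated as noise and skipped.

```javascript
// Hypothetical helper: turns console text such as "report.pdf: 12"
// into an array of { file, count } records. Unmatched lines are ignored.
function parseCountLines(text) {
    var lines = String(text).split(/\r?\n/);
    var results = [];
    for (var i = 0; i < lines.length; i += 1) {
        // Greedy match, so a colon inside a file name stays with the name.
        var m = /^(.+):\s*(\d+)\s*$/.exec(lines[i]);
        if (m) {
            results.push({ file: m[1], count: parseInt(m[2], 10) });
        }
    }
    return results;
}
```

The resulting records could then be sorted by `count` or written out as CSV, depending on what the coders need.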

     
