
Problem to count WORD in PDF file with VBScript

New Here, Feb 24, 2017

Hi Colleagues!

I need to complete a (test) script that counts how many times a word appears in a PDF document.

I am using this script:

Option Explicit

Dim accapp, acavdocu
Dim pdf_path, bReset, Wrd_count

pdf_path = "C:\Tips.pdf"

Set accapp = CreateObject( "AcroExch.App" )
accapp.Show()

Set acavdocu = CreateObject( "AcroExch.AVDoc" )

If acavdocu.Open( pdf_path, "" ) Then
  acavdocu.BringToFront()
  bReset = 1 : Wrd_count = 0

  'FindText: finds the specified text, scrolls so that it is visible, and highlights it
  Do While acavdocu.FindText( "Primary", 1, 1, bReset )
    bReset = 0 : Wrd_count = Wrd_count + 1
    Wait 0, 200
  Loop
End If

.....

The problem is that the loop never finishes. It counts the word on each page through to the end of the document (8 pages) and then starts again from the beginning.

Please tell me how I can count every occurrence of the word I need and then exit the loop.

Thank you in advance.

TOPICS
Acrobat SDK and JavaScript

LEGEND, Feb 26, 2017

FindText isn't designed for word counting, and it may well loop. It's just a shortcut for showing text in the UI. You probably need to extract all the words, check each one, and count the matches. One way is the JavaScript document.getPageNthWord method.
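The extract-and-count idea might look roughly like the sketch below. It assumes `doc` is an Acrobat JavaScript Doc object (or anything exposing `numPages`, `getPageNumWords` and `getPageNthWord`); the function name `countWord` and the case-insensitive comparison are illustrative choices, not something from the thread.

```javascript
// Count case-insensitive occurrences of a single word in a document.
// Assumption: `doc` exposes the Acrobat JavaScript Doc members numPages,
// getPageNumWords(nPage) and getPageNthWord(nPage, nWord, bStrip).
// `countWord` is a hypothetical helper name.
function countWord(doc, target) {
  var wanted = target.toLowerCase();
  var count = 0;
  for (var p = 0; p < doc.numPages; p++) {
    var numWords = doc.getPageNumWords(p);
    for (var i = 0; i < numWords; i++) {
      // bStrip = true asks for the word without punctuation and whitespace.
      var word = doc.getPageNthWord(p, i, true);
      if (word && word.toLowerCase() === wanted) {
        count++;
      }
    }
  }
  return count;
}
```

In the Acrobat JavaScript console this could be tried as `countWord(this, "Primary")`. Because it compares one word at a time, it would not match a multi-word phrase such as "Hello World" without extra logic.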

Engaged, Feb 27, 2017

You can store the current page with something like:

Set gAVPageView = acavdocu.GetAVPageView
Set gPdPage = gAVPageView.GetPage
pgn = gPdPage.GetNumber

Save pgn as the last page, and if the current page (pgn) is smaller than the last page, quit the loop. Alternatively, use this in combination with PDDoc.GetNumPages().

br. Reinhard
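The page-tracking exit suggested above could be sketched as follows. This is only an illustration under stated assumptions: `avDoc` is taken to be an AcroExch.AVDoc-like object (for example one created with `new ActiveXObject("AcroExch.AVDoc")` under Windows Script Host), the name `countUntilWrap` is hypothetical, and `maxHits` is a safety cap added here because a page-number check alone cannot detect a wrap when every hit is on the same page.

```javascript
// Count FindText hits until the search wraps around to an earlier page.
// Assumptions: `avDoc` behaves like AcroExch.AVDoc; `countUntilWrap` and
// `maxHits` are hypothetical and not part of the Acrobat API.
function countUntilWrap(avDoc, word, maxHits) {
  var lastPage = -1;
  var count = 0;
  var reset = 1; // first search starts from the beginning of the document
  while (count < maxHits && avDoc.FindText(word, 1, 1, reset)) {
    reset = 0;
    // Page that FindText scrolled to, read the same way as in the reply above.
    var page = avDoc.GetAVPageView().GetPage().GetNumber();
    if (page < lastPage) {
      break; // page number went backwards, so the search has wrapped
    }
    lastPage = page;
    count++;
  }
  return count;
}
```

If all matches sit on a single page, the loop stops only at `maxHits`, so the returned value is then an upper bound rather than a real count; extracting the words and counting them avoids that problem.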

Engaged, Feb 27, 2017

Hmm, I just remembered an old VBScript experiment for finding text in a PDF. Perhaps you can use that.

'// Settings: file name and word to find
FileNM = "d:\Test2.pdf"
WordTF = "Hello World"

'// Check that the file exists
Set fs = CreateObject("Scripting.FileSystemObject")
If Not fs.FileExists(FileNM) Then
    MsgBox "Oops! " & FileNM & " doesn't exist. Try again!", vbExclamation
    WScript.Quit
End If

'// Start Acrobat and open the file in a view
Set gApp = CreateObject("AcroExch.App")
Set gAVDoc = CreateObject("AcroExch.AVDoc")
OK = gAVDoc.Open(FileNM, "")
If Not OK Then
    MsgBox "Error opening file " & FileNM, vbExclamation
    WScript.Quit
End If

'// Comment both lines out to work hidden
gApp.Show
gAVDoc.BringToFront

'// Let's go
readAndFindText()  '// 15 sec for 100 pages (10 sec in hidden mode)

Function readAndFindText()
    Set gPdDoc = gAVDoc.GetPDDoc()
    maxPages = gPdDoc.GetNumPages
    foundOnPage = ""
    Set gAVPageView = gAVDoc.GetAVPageView
    For x = 0 To maxPages - 1  '// loop over all the pages
        gAVPageView.GoTo x
        Set PdfPage = gAVPageView.GetPage
        Set PageHL = CreateObject("AcroExch.HiliteList")
        PageHL.Add 0, 9000  '<<-- SET in FILE! (Start, End [9000 = all])
        Set PageSel = PdfPage.CreatePageHilite(PageHL)
        pdfData = ""
        For i = 0 To PageSel.GetNumText - 1  '// collect all words on the current page
            pdfData = pdfData & PageSel.GetText(i)
        Next
        'MsgBox pdfData  '// debug: show the extracted text of each page
        '// Remember the 1-based page number when the text occurs on this page
        If InStr(pdfData, WordTF) > 0 Then foundOnPage = foundOnPage & (x + 1) & ","
        'MsgBox "page: " & x & " / " & foundOnPage & vbLf & pdfData
    Next
    MsgBox "found on Page: " & foundOnPage
End Function

'// Release the Acrobat objects
Set gAVPageView = Nothing
Set gAVDoc = Nothing
Set gApp = Nothing
