15 Replies Latest reply on Apr 4, 2013 5:58 AM by JADarnell

    Scanning text in Javascript

    JADarnell Level 1

      I have some code that looks like this (please forgive the table format...that seems to be what happens when I copy/paste from UltraEdit into the forum screen):


      //  Shortened for brevity...


      var TheWord  = LocalDoc.getPageNthWord(ThePage, b, false);
      var WordLength = TheWord.length;
      var LastChar   = TheWord.charAt(WordLength - 1);
      var LastCharCode = TheWord.charCodeAt(WordLength - 1);


      //  Does the remaining string have a "-" in it?  This is not the
      //  normal dash, but 0xAD (decimal 173), i.e. a soft hyphen.
      //  Remove it.
      //  Something strange happens on the Mac.  Though the char is detected as
      //  173, neither search() nor indexOf() actually finds the hyphen.
      //  I am going to try scanning for it char by char and handling it there.
      AddTrace("We are about to look for the hyphen.  The word is |" + TheWord + "|", 145);
      var HyphenIndex = TheWord.indexOf("-");
      AddTrace("The position of the hyphen is " + HyphenIndex);   // returns -1
      if(TheWord.search("-") != -1)
      {  //  Never executed


      To summarize:

        I have a block of text that I must scan to pull out URLs and email addresses.  If either appears at the end of a line and is too long to fit, the Acrobat text processor hyphenates the word using a character with char code 173 (0xAD).  On the Mac, when I use getPageNthWord() to retrieve that word, the whole word, including the hyphen, is returned.  So the returned word would look something like this:




      (Please note that though I am using a standard hyphen for this discussion, in the actual code I am using the character with character code 0xAD. I have triple-checked this.)


      I have used String::charAt() and String::charCodeAt() to look at every character in the string, one by one.  In the expected spot, charAt() returns the hyphen character and charCodeAt() returns character code 173.
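      A minimal sketch of the char-by-char check described above, for readers who want to reproduce it. The helper names (indexOfCharCode, stripSoftHyphens) and the sample word are made up for illustration and are not part of the Acrobat API or of my actual script.

```javascript
// Hypothetical helpers for illustration only; not Acrobat API functions.
var SOFT_HYPHEN_CODE = 173; // 0xAD

// Return the index of the first character whose code equals charCode, or -1.
function indexOfCharCode(text, charCode) {
    for (var i = 0; i < text.length; i++) {
        if (text.charCodeAt(i) === charCode) {
            return i;
        }
    }
    return -1;
}

// Return a copy of text with every soft hyphen removed.
function stripSoftHyphens(text) {
    var result = "";
    for (var i = 0; i < text.length; i++) {
        if (text.charCodeAt(i) !== SOFT_HYPHEN_CODE) {
            result += text.charAt(i);
        }
    }
    return result;
}

// Made-up sample word; the soft hyphen is built from its code so it is unambiguous.
var sampleWord = "exam" + String.fromCharCode(SOFT_HYPHEN_CODE) + "ple";
var softHyphenIndex = indexOfCharCode(sampleWord, SOFT_HYPHEN_CODE);
var cleanedWord = stripSoftHyphens(sampleWord);
```

      Because this compares numeric character codes, it does not depend on how the hyphen character was typed or saved in the script file.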


      The bottom line is that even though the String functions mentioned above confirm that the character is in fact in the string, both search() and indexOf() return -1 when I pass "-" as the string to search for.
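      One way to rule out the script file's encoding as a factor is to build the search string from the character code instead of typing the character. The sketch below assumes a made-up sample word; whether this changes the behavior I see on the Mac is not something I have established.

```javascript
// Build the needle from its character code so the source file's
// encoding cannot change it. String.fromCharCode(0xAD) equals "\u00AD".
var softHyphen = String.fromCharCode(0xAD);

// Made-up sample word containing a soft hyphen (not real document text).
var hyphenatedWord = "some" + softHyphen + "one@example.com";

// An ASCII hyphen-minus (0x2D) is a different character, so it is not found.
var asciiHyphenPos = hyphenatedWord.indexOf("-");

// Searching for the soft hyphen itself, with indexOf() and with search().
var softHyphenPos = hyphenatedWord.indexOf(softHyphen);
var searchPos = hyphenatedWord.search(/\u00AD/);

// Remove every soft hyphen by splitting on it and rejoining the pieces.
var joinedWord = hyphenatedWord.split(softHyphen).join("");
```

      If asciiHyphenPos is -1 while softHyphenPos is not, the string passed to indexOf() was not the 0xAD character at run time.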


      Though I have not tested the search() function on single characters, I have tested the indexOf() function, and it works just fine for other single characters.


      What am I not understanding about the String object, the indexOf() function and the search() functions?