
Replacing all text in book with sample text

Jul 11, 2013 2:08 AM

Hello fellows,

 

Is there a way to replace all the textual content in a book with some sample text using ExtendScript?

 

If yes, could you please point me to the relevant APIs?

 

Thank you for your help in advance!

 
Replies
    Jul 14, 2013 4:26 AM   in reply to rombanks

    Of course this is possible, but it might get very tricky depending on the structure of the document you want to process and on the type of sample text you want to replace the content with. If the sample text should more or less look like readable text, you have to figure out a way to cut that text up into substrings and place them in the right locations so that the end result looks like the sample you are trying to create. It is easier if the output can be complete nonsense, as you then do not need to care about the length of the text strings that are replaced in each of the paragraphs or subparagraphs your script finds.

     

    Lots of issues to handle. Possibly, using the Find and Replace function would be something to look at in the FrameMaker Scripting Guide. But that function is not exactly the easiest to handle from a script. The other option is to walk through all paragraphs in the flow, then walk through all anchored frames, figure out if they have paragraphs in them, then walk through all table cells and process the text in those.

     

    An important issue is what to do with the many markers and anchors that FrameMaker puts in the text flow. These may or may not take up space in the text (depending on the type of marker), and you do not want to remove all of them. You may want to remove the index and cross-reference markers, but certainly not the anchors for tables and such. Also, walking through all paragraphs in a flow does not lead you through the text that appears in tables or in anchored frames.

     

    Good luck

     

    Jang

     
    Jul 14, 2013 4:50 AM   in reply to rombanks

    It would still come down to walking through the entire list of paragraphs (in all flows that appear on body pages), then through all table cells and also through any text appearing in anchored frames, although I assume that leaving the text in anchored frames (call-outs or sidehead notes) untouched might be acceptable.

     

    I will have a look at the required loops and post a possible solution later today. I do not want to post code that I have not tested first. And this type of script might come in handy for some of my clients, too.

     

    I am assuming that replacing every single character with an 'x' does the trick for you? It would not change the formatting or text flow.

     

    Jang

     
    Jul 14, 2013 5:17 AM   in reply to 4everJang

    Just a caution: replacing every character with an "x" could definitely change the text flow.

     

    Here is the sentence above with each non-space character changed to an "x":

     

    xxxx x xxxxxxxx xxxxxxxxx xxxxx xxxxxxxxx xxxx xx xxx xxxxx xxxxxxxxxx xxxxxx xxx xxxx xxxx

     

    As you can see, the line is a bit shorter.

     

    Rick

     
    Jul 14, 2013 5:54 AM   in reply to rombanks

    Yes, the point that Rick was making crossed my mind, too. Using regular expressions, you can easily specify which characters should be replaced. Listing a number of same-width characters and replacing them with an "x" is as easy as this:

     

    sTextString.replace ( /[abcdefghknopqrsuvyz]/g, "x" );

     

    This leaves the characters that are not listed as they are. The exact list depends on the fonts that are used, of course, as a font might have a different width for more characters than the ones I left out. But even if you only replace a couple of characters throughout the doc, it would be enough to get the desired effect.
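A minimal sketch of the same-width replacement above, wrapped in a function so it can be tried on a plain string outside FrameMaker. The name maskSameWidth is hypothetical, and the character list is the one from the post, which is font-dependent.

```javascript
// Characters assumed to be roughly as wide as "x"; this depends on the font.
var SAME_WIDTH = /[abcdefghknopqrsuvyz]/g;

// Hypothetical helper: returns a masked copy of sText.
// String.replace never changes the original string, so the result
// must be assigned or returned to have any effect.
function maskSameWidth(sText) {
    return sText.replace(SAME_WIDTH, "x");
}

var sMasked = maskSameWidth("Just a caution");
// Unlisted characters (capitals, i, t, spaces, ...) are left as they are,
// and the string keeps its original length.
```

Because every match is swapped for exactly one character, the offsets of markers and anchors in the paragraph stay the same after the replacement.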

     

    The difficulty is getting to the text strings and then replacing them. There are many objects for which a GetText method exists, but you have to set the flags such that you actually get the right text strings out of that method. And then you have to figure out where in the doc the text string is, then delete the existing one and add the replacement. All of this has to be done without deleting any markers or anchors.

     

    Another approach might be to set the text location to the first character in the main flow and then walk through the entire document character by character, testing each one and replacing it where required. But I don't think you would get into tables with that method, as those are linked to the running text via an anchor in that running text. So the table cells will have to be processed separately. And if you do have a method to change text in table cells, you can use that method on all paragraphs.
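A rough sketch of what processing the table cells separately could look like, assuming the usual linked-list properties of the FrameMaker object model (FirstTblInDoc, NextTblInDoc, FirstRowInTbl, NextRowInTbl, FirstCellInRow, NextCellInRow, FirstPgf, NextPgfInFlow). The function name forEachCellPgf and the callback are hypothetical, and this has not been run against a real document.

```javascript
// Hypothetical helper: calls fnVisit for every paragraph in every table cell.
// Assumption: each list ends with an object whose ObjectValid() is false.
function forEachCellPgf(oDoc, fnVisit) {
    var oTbl = oDoc.FirstTblInDoc;
    while (oTbl.ObjectValid()) {
        var oRow = oTbl.FirstRowInTbl;
        while (oRow.ObjectValid()) {
            var oCell = oRow.FirstCellInRow;
            while (oCell.ObjectValid()) {
                // The paragraphs of one cell form their own short chain.
                var oPgf = oCell.FirstPgf;
                while (oPgf.ObjectValid()) {
                    fnVisit(oPgf);
                    oPgf = oPgf.NextPgfInFlow;
                }
                oCell = oCell.NextCellInRow;
            }
            oRow = oRow.NextRowInTbl;
        }
        oTbl = oTbl.NextTblInDoc;
    }
}
```

The callback would do the same clear-and-add work as for body paragraphs. Table titles are not covered by this walk, and the table list may also contain tables that are not on body pages, so test on a copy first.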

     

    I think Rick has more experience in tweaking text strings. I am usually working on structured documents and only handling the element objects and their hierarchy, not so much changing the text content of documents. Rick, do you have an approach to walk through all text strings in a document and replace them without breaking anything?

     

    Ciao

     

    Jang

     
    Jul 14, 2013 6:59 AM   in reply to 4everJang

    OK, I have figured it out. Here is a script that works across all text in the main flow after opening the document.

     

    // Start at the paragraph that holds the insertion point.
    var oDoc = app.ActiveDoc;
    var oRange = oDoc.TextSelection;
    var oPgf = oRange.beg.obj;
    var oTLoc1 = new TextLoc;
    var oTLoc2 = new TextLoc;
    var oTRange = new TextRange;
    var sNewTxt;
    var i;

    while ( oPgf.ObjectValid ( ) )
    {
        // GetText returns copies of the text items, not the live document text.
        var oTexts = oPgf.GetText ( -1 );
        oTLoc1.obj = oPgf;
        oTLoc2.obj = oPgf;
        for ( i = 0; i < oTexts.length; i++ ) {
            // Only string items are processed; markers and anchors are left alone.
            if ( oTexts[i].dataType == Constants.FTI_String ) {
                // Select exactly the range this string occupies in the paragraph.
                oTLoc1.offset = oTexts[i].offset;
                oTLoc2.offset = oTexts[i].offset + oTexts[i].sdata.length;
                oTRange.beg = oTLoc1;
                oTRange.end = oTLoc2;
                oDoc.TextSelection = oTRange;
                // Delete the selected text, then add the masked text at the same spot.
                oDoc.Clear ( 0 );
                sNewTxt = oTexts[i].sdata.replace ( /[a-z]/g, 'x' );
                sNewTxt = sNewTxt.replace ( /[A-Z]/g, 'X' );
                oDoc.AddText ( oTLoc1, sNewTxt );
            }
        }
        oPgf = oPgf.NextPgfInDoc;
    }

     
    Jul 14, 2013 7:02 AM   in reply to rombanks

    Hi,

     

    Your script handles the text items that are retrieved from the document, not the document itself. The text strings returned by GetText are copies of whatever is in the document. So you have to get the document locations, clear the text in the document (without removing any anchors or markers), and then add the tweaked text.

     

    This is why I mentioned it is not as simple as it looks. But my script works, so you can use that.

     

    Thanks for the challenge

     

    Jang

     
    Jul 14, 2013 7:09 AM   in reply to 4everJang

    Hello again,

     

    I used the script on another document and Frame crashed. Use the script with care. I guess the text selection is not exactly right or the end of the loop is a little buggy. But the main part of the script works. If I find the error later, I will post it here.

     

    Ciao

     

    Jang

     
    Jul 14, 2013 7:56 AM   in reply to rombanks

    The Undo is a matter of Revert to Saved. I don't think any other method would be feasible, as you really want to replace almost ALL the text in a file.

     

    In the meantime I have experimented on a couple of other, more realistic, test documents and I got some crashes in FM11. So instead of using the TextSelection and then walking through all paragraphs in the document, I now use the proper method to select the first paragraph in the first text frame in the main flow and then walk through the linked list of paragraphs down to the end of the flow.
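A minimal sketch of that walk, assuming the MainFlowInDoc, FirstTextFrameInFlow, FirstPgf and NextPgfInFlow properties of the FrameMaker object model. The function name getMainFlowPgfs is hypothetical, and this is not the revised script mentioned in this post.

```javascript
// Hypothetical helper: collects the paragraphs of the main flow in order.
// Assumption: the chain ends with an object whose ObjectValid() is false.
function getMainFlowPgfs(oDoc) {
    var aPgfs = [];
    var oFrame = oDoc.MainFlowInDoc.FirstTextFrameInFlow;
    var oPgf = oFrame.FirstPgf;
    while (oPgf.ObjectValid()) {
        aPgfs.push(oPgf);
        // NextPgfInFlow stays inside this flow, unlike NextPgfInDoc.
        oPgf = oPgf.NextPgfInFlow;
    }
    return aPgfs;
}
```

Collecting the paragraphs first and editing them in a second loop keeps the traversal separate from the changes, which makes single stepping easier.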

     

    I am still testing this, as there are unforeseen problems with the selection of paragraphs, so be careful when you want to apply this script, especially in a book. Do extensive testing, single stepping and make backups of all your files before you put this into production. It might be a simple typo but I am not sure of that and I do not have time to make this rock solid - at least not today.

     

    Good luck

     

    Jang

     
    Jul 16, 2013 5:58 AM   in reply to rombanks

    You have to set oDoc.TextSelection to a text range before methods like Clear or Copy have any effect: they take the current TextSelection as the range on which they are applied. The flags in the Clear and Copy methods define options for the method, such as suppressing any warnings that might otherwise occur or what to do with hidden text.

     

    That is why nothing is deleted if you do not set oDoc.TextSelection. What you can try is to replace the 0 flag in oDoc.Clear with Constants.FF_VISIBLE_ONLY. Let me know if that solves the problem.

     

    About an earlier question: the oTexts array that is returned by the GetText method contains text string objects and each of them has a property "offset" which gives the offset within the current paragraph.
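To illustrate how offset and sdata together define the range to select, here is a small sketch that turns a list of text items into begin/end offsets plus replacement text, without touching the document. The name planReplacements is hypothetical, and the string type value is passed in rather than assumed.

```javascript
// Hypothetical helper: for every string item, work out the range it occupies
// in the paragraph and the masked text that should replace it.
// aItems: array-like result of GetText; nStringType: Constants.FTI_String.
function planReplacements(aItems, nStringType) {
    var aPlan = [];
    for (var i = 0; i < aItems.length; i++) {
        var oItem = aItems[i];
        if (oItem.dataType !== nStringType) {
            continue; // markers, anchors and other non-string items
        }
        aPlan.push({
            beg: oItem.offset,
            end: oItem.offset + oItem.sdata.length,
            text: oItem.sdata.replace(/[a-z]/g, "x").replace(/[A-Z]/g, "X")
        });
    }
    return aPlan;
}
```

Inside FrameMaker this would be called as planReplacements(oPgf.GetText(-1), Constants.FTI_String). Each entry then gives the two offsets for the TextRange to set as oDoc.TextSelection before calling Clear and AddText.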

     

    Good luck

     

    Jang

     
    Jul 16, 2013 11:20 PM   in reply to rombanks

    Hi,

     

    I am single-stepping through the script and find that it also deletes automatic text, such as auto-numbers in a table of contents. That is where Frame dies. So I have to figure out a way to distinguish editable text from auto-generated text and leave the auto-generated text as is. I am hoping that will solve the crashes and make it rock solid.

     

    I will post a solution later, but it might not be today.

     

    Ciao

     

    Jang

     
    Jul 17, 2013 12:59 AM   in reply to 4everJang

    OK, I think I have this nailed now. The devil is in the details, as usual. I have adapted the flags for the GetText method so that only the relevant text strings are returned, i.e. a text string that spans multiple lines is retrieved as one single string. Some irrelevant objects are ignored, but others have to be retrieved so that the script knows not to touch the text that follows them. This is all pretty tricky work, but the code below does the trick without crashing, at least in some of the documents I have tested it on. It processes one single flow, so if you have documents with more than one flow, you will have to repeat the script for each flow - after first placing the text cursor at the start of that flow.
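For documents with more than one flow, a minimal sketch of visiting every flow, assuming the FirstFlowInDoc and NextFlowInDoc properties of the FrameMaker object model. The name forEachFlow is hypothetical, and this is not part of the script described in this post.

```javascript
// Hypothetical helper: calls fnVisit once for every flow in the document.
// Assumption: the list ends with an object whose ObjectValid() is false.
function forEachFlow(oDoc, fnVisit) {
    var oFlow = oDoc.FirstFlowInDoc;
    while (oFlow.ObjectValid()) {
        fnVisit(oFlow);
        oFlow = oFlow.NextFlowInDoc;
    }
}
```

The callback could start from oFlow.FirstTextFrameInFlow.FirstPgf and follow NextPgfInFlow to the end of that flow. This list presumably also includes flows on master and reference pages, so some filtering is likely needed before replacing any text.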

     

    The script is getting a little too long to post here, so if you drop me an e-mail to jang at jang dot nl I will send the jsx file to you. That saves you copying and pasting and possible mishaps due to the conversion from jsx to web to jsx.

     

    Ciao

     

    Jang

     