
How to get formatted text into arrays

Community Expert, Mar 10, 2015

Dear experts and helpers,

For my project I import an RTF file and then read the data from it into 3 arrays. This works fine when just using the string contents of the paragraphs. However, the final script should be able to read and replace formatted text...
Why use the intermediate arrays? Because otherwise I need to switch back and forth between two fm-documents (and one may be a book component).

The imported file starts with a number of lines separated into two items by a TAB (» denotes a TAB, in FM \x08)
[[Garneau, 1990 #12]]    »   [9]
The right item may also be locally formatted text, e.g. [9]
Then follow the same (or smaller) number of paragraphs with formatted text like this:
[9] » D. Garneau, Ed., National Language Support Reference Manual (National language Information Design Guide. Toronto, CDN: IBM National Language Technical Centre, 1990.
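
The two-part intro lines described above can be taken apart with ordinary string functions. The following is only a sketch in plain JavaScript (it should also run as ExtendScript); the names `FM_TAB` and `splitCitationLine` are made up for illustration, and it works on the string content only, so it does not carry any formatting.

```javascript
// TAB as FrameMaker reports it in paragraph strings (\x08), per the post above.
var FM_TAB = String.fromCharCode(8);

// Split one intro line into its raw and formatted parts.
// Returns null for lines that are not citation lines.
function splitCitationLine(line) {
  if (line.substring(0, 2) !== "[[") {
    return null;                       // citation lines start with [[
  }
  var pos = line.indexOf(FM_TAB);
  if (pos < 0) {
    return null;                       // no TAB found: treat as malformed
  }
  return {
    raw: line.substring(0, pos),       // left column, e.g. [[Garneau, 1990 #12]]
    fmt: line.substring(pos + 1),      // right column, e.g. [9]
    fmtOffset: pos + 1                 // string index where the right part starts
  };
}

var parts = splitCitationLine("[[Garneau, 1990 #12]]" + FM_TAB + "[9]");
```

The `fmtOffset` value is kept because the position of the right-hand part is what a text range would need later, assuming the paragraph holds only plain characters before the TAB.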

Is it possible to replace in the body of the function below the following piece

  while(pgf.ObjectValid()) {
    pgfText = GetText (pgf, newDoc);
    gaBibliography.push(pgfText);
    pgf = pgf.NextPgfInFlow;
  }

with this

  while(pgf.ObjectValid()) {
    gaBibliography.push(pgf);
    pgf = pgf.NextPgfInFlow;
  }

Do I need a special declaration for the array gaBibliography?
And how do I get the right-hand part of the intro lines, with its formatting, into the array gaFmtCitsFmt?

Currently I read into arrays only the 'strings' (function GetText not shown):

var gaFmtCitsRaw  = [];                           // left column in processed RTF
var gaFmtCitsFmt  = [];                           // right column in processed RTF
var gaBibliography= [];                           // bibliography lines from processed RTF
// filename is something like E:\_DDDprojects\FM+EN-escript\FM-testfiles\BibFM-collected-IEEE.rtf

function ReadFileRTF (fileName) {
  var nCits=0, nBib = 0, openParams, openReturnParams, newDoc, pgf, pgfText ;
  var TAB = String.fromCharCode(8);               // FM uses \x08 for TAB, not ASCII \x09
  var parts = [];
 
  openParams = GetOpenDefaultParams();
  openReturnParams =  new PropVals(); 
  newDoc = Open (fileName, openParams, openReturnParams); 
  pgf = newDoc.MainFlowInDoc.FirstTextFrameInFlow.FirstPgf;  // get first pgf in flow

// --- read the temp/formatted citations 
  while(pgf.ObjectValid()) {
    pgfText = GetText (pgf, newDoc);
    if (pgfText.substring (0,2) == "[[") {        // citation lines start with [[
      parts = pgfText.split(TAB);                 // get the two parts of the line
      gaFmtCitsRaw.push (parts[0]);               // Push the result onto the global array
      gaFmtCitsFmt.push (parts[1]);
      pgf = pgf.NextPgfInFlow;
    } else { break }
  }

// --- read the bibliography
  while(pgf.ObjectValid()) {                      // until end of doc
    pgfText = GetText (pgf, newDoc);
    gaBibliography.push(pgfText);
    pgf = pgf.NextPgfInFlow;
  }
  newDoc.Close (Constants.FF_CLOSE_MODIFIED);
} // --- end ReadFileRTF

The next question will then be how to modify Ian Proudfoot's FindAndReplace script to handle formatted text as the replacement. IMHO I will need to use copy/paste ...


Community Expert, Mar 10, 2015

Hi Klaus, You can push paragraph objects into an array without a special declaration. I am pressed for time, but will try to look at the rest of your question later. -Rick
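
A small sketch of what pushing objects into an array means in practice: the array stores references, not copies. `FakePgf` is a made-up stand-in so the snippet runs outside FrameMaker. The part about validity is an assumption to check in FrameMaker: my understanding is that paragraph objects stop being valid once their document is closed (as ReadFileRTF does with newDoc), so stored paragraphs would have to be used before that Close call.

```javascript
// Stand-in for a paragraph object; in FrameMaker this would be a Pgf.
function FakePgf(text) {
  this.text = text;
  this.valid = true;
}
FakePgf.prototype.ObjectValid = function () { return this.valid; };

var gaBibliography = [];               // a plain array, no special declaration
var pgf = new FakePgf("[9] D. Garneau, Ed.");
gaBibliography.push(pgf);              // stores a reference to the same object

// Simulate closing the source document (assumed behaviour, see above):
// the object becomes invalid, and the array entry shows the same state
// because it is the same object, not a snapshot of it.
pgf.valid = false;
var stillUsable = gaBibliography[0].ObjectValid();
```

If that assumption holds, an array of paragraph objects saves the text lookup but not the second open document; the formatted content still has to be transferred while the source is open.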

Mentor, Mar 11, 2015

Klaus, I would suggest that copy/paste might be the easiest way. However, I would not claim that it is 100% reliable. It usually works, I think, but I would not bet on it.

The alternative is to query the text range of each paragraph for any format changes, store each set of properties from the original, then iterate over the new text and reapply. You can find out where formatting changes occur with something like:

textItems = doc.GetTextForRange (textRange, Constants.FTI_CharPropsChange);

Now, I realize this doesn't tell you much and the truth is that it is a complicated concept. I would have to spend all day writing about it, because you need an intimate knowledge of text ranges and text item structures to make it work. Obviously, I can't do that.
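
To make the "find where formatting changes, then reapply per piece" idea a little more concrete, here is a plain-JavaScript model of just the bookkeeping step. It does not call FrameMaker. It assumes you have already collected the offsets at which character properties change (for example from the offset of each returned text item) and that they are sorted in ascending order. The name `splitIntoRuns` and the sample numbers are made up.

```javascript
// Turn a sorted list of change offsets into runs of uniform formatting.
// Each run is { beg, end } with end exclusive, covering 0..textLength.
function splitIntoRuns(changeOffsets, textLength) {
  var runs = [];
  var start = 0;
  for (var i = 0; i < changeOffsets.length; i += 1) {
    var off = changeOffsets[i];
    if (off > start && off <= textLength) {
      runs.push({ beg: start, end: off });   // close the run before the change
      start = off;
    }
  }
  if (start < textLength) {
    runs.push({ beg: start, end: textLength }); // last run up to the end
  }
  return runs;
}

// Hypothetical paragraph of 80 characters with changes at offsets 17 and 60.
var runs = splitIntoRuns([0, 17, 60], 80);
// runs covers 0-17, 17-60 and 60-80
```

Each run could then become a text range whose properties are read in the source and applied to the matching range in the target. As far as I know the calls for that are doc.GetTextProps and doc.SetTextProps, but check them against the scripting guide before relying on this.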

What I can do is provide a working sample that shows the concept, although for a somewhat different application. I ran into this same type of issue with a script that applies character formatting, where I wanted to have an Undo feature as well. In order to accomplish an undo, I have to effectively remember the original formatting of the entire text snippet where the new formatting was applied. This is similar to what I think you want... to remember (and reapply) the original formatting of text snippets from the imported RTF content. If you are interested, go here and get the script called ADVANCED_Create_formatting_shortcuts.jsx:

FrameMaker ExtendScript Samples - West Street Consulting

Then, look up the following functions:

CaptureChrFormatUndoSnapshot()

UndoChrFormatApply()

Please accept the disclaimer that this is a complicated concept embedded within a complicated script. I hope it can be of some assistance.

Russ

Community Expert, Mar 11, 2015

Thanks to Rick and Russ for the initial feedback. Russ, your example is really complicated, but thanks to your extensive comments I should get at least some insight.

My major problem seems to be understanding text ranges.

- How can I 'grab' a full paragraph?

- How can I 'grab' a part of a paragraph, such as the part behind the first TAB character?

I know that you all do not have much time, in particular compared with me as a retired person. I hope you will be patient with me. I'm experimenting a lot to enhance my knowledge, mainly based on examples from others.

Mentor, Mar 11, 2015

Klaus,

Working with text is about the most complicated thing to do within FrameMaker. It seems counter-intuitive, since it is about the easiest thing to do with the GUI. But alas, once you remove the ability to select with a mouse and type with a keyboard, text becomes a wild jungle of complexity.

Text ranges are not too bad, once you get the general idea. It is just that... a range of text, like something you would select with a mouse. Like a mouse selection, it starts before some character in some paragraph and ends after some character in some paragraph. It may be the same paragraph, which is a selection within a paragraph. The character can even be the same, which is then just an insertion point (cursor) somewhere.

So, a text range is a data structure that defines two paragraphs and two characters. In the jargon of scripting, the character is called an "offset." An offset is simply the number of characters past the beginning of said paragraph, where 0 is the beginning.

For example, if you want to capture the first five characters of a paragraph as a text range, you can do this, where 'pgf' is some paragraph object:

var textRange = new TextRange();

textRange.beg.obj = pgf;

textRange.beg.offset = 0;

textRange.end.obj = pgf;

textRange.end.offset = 5;

If you want to capture a whole paragraph, change that last line to the number of characters in the pgf, or you can do this:

textRange.end.offset = Constants.FV_OBJ_END_OFFSET;

...where that constant is just some built-in thing that means "get me to the end of whatever." It's a convenience of the interface.

I'll also note that a text range is actually just a structure holding two text location structures, one named 'beg' and one named 'end.' If you think of a text location as defined by a paragraph and an offset from the first character, maybe that will make more sense.
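
For the earlier question about grabbing the part of a paragraph behind the first TAB, the beg/end structure can be sketched with plain objects. This is only a model that runs outside FrameMaker: `describeRange` and `rangeAfterFirstTab` are made-up helper names, and the end-of-object value is passed in as a parameter instead of reading Constants.FV_OBJ_END_OFFSET. It also assumes that string positions equal paragraph offsets, which I would only expect when nothing but plain characters precedes the TAB.

```javascript
// Build a plain description of a range inside one paragraph.
function describeRange(pgf, begOffset, endOffset) {
  return {
    beg: { obj: pgf, offset: begOffset },
    end: { obj: pgf, offset: endOffset }
  };
}

// Range covering everything after the first TAB (\x08) of a paragraph.
// pgfText is the paragraph's string content; endOfObject stands for
// Constants.FV_OBJ_END_OFFSET when used inside FrameMaker.
function rangeAfterFirstTab(pgf, pgfText, endOfObject) {
  var pos = pgfText.indexOf(String.fromCharCode(8));
  if (pos < 0) {
    return null;                       // no TAB in this paragraph
  }
  return describeRange(pgf, pos + 1, endOfObject);
}

// A whole paragraph is simply offset 0 to the end-of-object value.
function wholePgfRange(pgf, endOfObject) {
  return describeRange(pgf, 0, endOfObject);
}
```

Inside FrameMaker the four values would be copied into a real TextRange, for example with new TextRange (new TextLoc (r.beg.obj, r.beg.offset), new TextLoc (r.end.obj, r.end.offset)).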

Text item structures are a whole new mess of complexity. I can't possibly go into an explanation of them here.

I think that many ES developers (definitely myself included) still use the FDK documentation because it is considerably more comprehensive. The two interfaces are largely parallel, but of course somewhat different in the language syntax. Consider that as a potential resource.

Russ

Community Expert, Mar 11, 2015

Thank you, Russ, for your explanations.

"Working with text is about the most complicated thing to do within FrameMaker"

Oh Lord, and FM is all about handling text ...
I understand the concept of a text range, and have already fiddled around with various types of text items (including concatenating them if they are strings).
But how do I convert a text range into a selection (e.g. for copying it to the clipboard)?

The FDK says about SetTextRange: «Set the text selection or insertion point by setting the property that specifies the text selection»

var oDoc = app.ActiveDoc;
var pgf  = oDoc.MainFlowInDoc.FirstTextFrameInFlow.FirstPgf;
var lastPfg = oDoc.MainFlowInDoc.FirstTextFrameInFlow.LastPgf;

   var tr = new TextRange();                      //get text selection for paragraph            
   tr.beg.obj = tr.end.obj = pgf;
   tr.beg.offset = 0;
   tr.end.offset = Constants.FV_OBJ_END_OFFSET;   
// var sel = oDoc.TextSelection(pgf);             // Err: TextRange() is not a function ???

oDoc.Copy();             // Docs say: copies the current selection to the FrameMaker clipboard
alert ("what's in the clipboard?");               // check from outside FM

// Add a new paragraph after the current paragraph. 
var newPgf = oDoc.NewSeriesPgf (lastPfg);         // OK
  oDoc.Paste();                                   // only non FM stuff is pasted

The oDoc.Paste() call on the last line pastes something only if the clipboard contains plain text (copied from outside FM). If I manually select something in the doc and run the script, nothing is pasted.

If I empty the clipboard and run this snippet, nothing is pasted (and ClipBoardInspector does not see anything).

Community Expert, Mar 11, 2015

Hi Klaus, Before you copy the text range, you have to select it first. -Rick

oDoc.TextSelection = tr;

Community Expert, Mar 12, 2015

Thank you very much, Rick.

I had the following statement, but got a strange error message and hence removed it ...

var sel = oDoc.TextSelection(pgf);   // Err: TextRange() is not a function ???

So now I get closer peu à peu:

var gaText = new Array ();
var oDoc = app.ActiveDoc;
var pgf  = oDoc.MainFlowInDoc.FirstTextFrameInFlow.FirstPgf;
var lastPfg = oDoc.MainFlowInDoc.FirstTextFrameInFlow.LastPgf;

  var tr = new TextRange();                       //get text selection for paragraph           
  tr.beg.obj = tr.end.obj = pgf;
  tr.beg.offset = 0;
  tr.end.offset = Constants.FV_OBJ_END_OFFSET;  

  oDoc.TextSelection = tr;
  oDoc.Copy();                                    // Clipboard OK (rtf)

  gaText.push (tr);                               // not correct object

// Add a new paragraph after the current paragraph.
  var newPgf = oDoc.NewSeriesPgf (lastPfg);       // OK

// oDoc.Paste(Constants.FF_INTERACTIVE);          // no dialogue
// error = oDoc.Paste(0);                         // nothing pasted

  var textLoc = new TextLoc (newPgf, 0); 
  oDoc.AddText (textLoc, gaText [0]);             // [object TextRange]

I had no success at all with the copy/paste approach (see the comments) and hence jumped ahead to what I really need: having the stuff in an array and placing it again.

But I do not have the correct object in the array. The text inserted is [object TextRange]. How do I get the contents of this text range?

Replacing the AddText call on the last line with

oDoc.AddText (textLoc, oDoc.GetTextForRange(gaText [0]));

inserts nothing, but no error is reported either.

Do I need to handle TextItems here ?

Community Expert, Mar 12, 2015

Hi Klaus, Here is a utility function that will get text from a text range or text object (Pgf, TextFrame, TextLine, SubCol, Cell, etc.):

function getText (textObj, doc) {
    // Gets the text from the text object or text range.
    var text = "", textItems, i;

    // Get a list of the strings in the text object or text range.
    if (textObj.constructor.name !== "TextRange") {
        textItems = textObj.GetText(Constants.FTI_String);
    } else {
        textItems = doc.GetTextForRange(textObj, Constants.FTI_String);
    }

    // Concatenate the strings.
    for (i = 0; i < textItems.len; i += 1) {
        text += textItems[i].sdata;    // index each item; the list itself has no sdata
    }
    return text; // Return the text
}

Then the gaText.push (tr) line in your code should be:

gaText.push (getText (tr, oDoc));

As far as the copying/pasting, I have had some success with ExtendScript. It basically works like this:

// Get your text range from somewhere and make sure it is selected.
doc.TextSelection = tr;

// Push the current contents of the clipboard onto the clipboard stack so it can be restored later.
PushClipboard ();

// Copy the selected text.
doc.Copy ();

// Get the target text range or text location and select it.
// This example shows a location at the beginning of a paragraph.
var targetTr = new TextRange (new TextLoc (pgf, 0), new TextLoc (pgf, 0));
doc.TextSelection = targetTr;

// Paste the text from the clipboard.
doc.Paste ();

// Restore the original clipboard contents.
PopClipboard ();
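
The same sequence can be wrapped in one function so that the order (select, copy, select target, paste) and the clipboard restore always stay together. This is a sketch, not a tested FrameMaker utility: `copyFormatted` is a made-up name, and the `clipboard` parameter is a hypothetical holder for the two clipboard functions, introduced only so the logic can also be exercised with stand-in objects outside FrameMaker.

```javascript
// doc:       a FrameMaker Doc (or a stand-in with TextSelection, Copy, Paste)
// sourceTr:  text range to copy
// targetTr:  insertion point or range that the pasted text replaces
// clipboard: hypothetical object with push() and pop() functions
function copyFormatted(doc, sourceTr, targetTr, clipboard) {
  clipboard.push();                    // save the user's clipboard contents
  try {
    doc.TextSelection = sourceTr;      // select first, then copy
    doc.Copy();
    doc.TextSelection = targetTr;      // select the target, then paste
    doc.Paste();
  } finally {
    clipboard.pop();                   // restore even if a step throws
  }
}
```

Inside FrameMaker the last argument could be something like { push: function () { PushClipboard (); }, pop: function () { PopClipboard (); } }. The try/finally is a design choice so the user's clipboard is restored even when the copy or paste step fails.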

Community Expert, Mar 12, 2015

Again, Rick, thank you very much for the time you are spending on me.

The GetText () routine is already in heavy use in my script, but here I need to find a method to place already formatted text. The format of bibliographic citations varies, and they may contain italic, bold, underline and even superscript as emphasis.

D. Garneau, Ed., National Language Support Reference Manual (National language Information Design Guide. Toronto, CDN: IBM National Language Technical Centre, 1990.

The first paragraph which I read with the test script contains a word in italics, and of course this format is not transported by the GetText () function. I need to preserve the formatting to be able to replace the temporary citations with the formatted ones:

from [[Garneau, 1990 #12]] to [9] to the above-mentioned paragraph. If I have all these elements (from the imported RTF) in arrays, I can use indices and do not need to loop through paragraphs to find them...

This finally found paragraph then replaces the temporary citations in FM footnotes.
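
The index-based lookup described above could look roughly like this on the string level. It is a sketch in plain JavaScript with made-up function names; it assumes the three arrays are filled as in ReadFileRTF and that each bibliography string starts with its label followed by a TAB (\x08). Whatever carries the formatting (a paragraph object or a text range) would sit at the same index in a parallel array.

```javascript
var FM_TAB = String.fromCharCode(8);   // TAB as it appears in FM strings

// Find the bibliography entry whose label (the text before the TAB)
// equals the formatted citation. Returns the index, or -1 if none matches.
function findBibIndex(fmtCitation, bibliography) {
  for (var i = 0; i < bibliography.length; i += 1) {
    var pos = bibliography[i].indexOf(FM_TAB);
    var label = pos < 0 ? bibliography[i] : bibliography[i].substring(0, pos);
    if (label === fmtCitation) {
      return i;
    }
  }
  return -1;
}

// Resolve a temporary citation to its bibliography index
// by going through the two parallel citation arrays.
function resolveCitation(rawCitation, citsRaw, citsFmt, bibliography) {
  for (var i = 0; i < citsRaw.length; i += 1) {
    if (citsRaw[i] === rawCitation) {
      return findBibIndex(citsFmt[i], bibliography);
    }
  }
  return -1;
}
```

Because several temporary citations may point to the same entry (the bibliography can be shorter than the citation list), the lookup matches on the label and does not assume that equal indices belong together.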

I also need to figure out what to do in Russ Ward's SearchReplace script to handle this. Maybe I must find the equivalent of "Replace by Paste". That's the reason why I was experimenting with Paste as well. But in my tests oDoc.Paste() or oDoc.Paste(0) do nothing, and I remember Russ' note about the unreliability of copy/paste...

I didn't think that text handling would be that complicated in ExtendScript...

Mentor, Mar 12, 2015

Klaus,

Copy/paste should work. When I question the reliability, I don't mean that the basic functionality is unreliable. I'm just saying that I would not always trust it to absolutely maintain the original formatting. And, that's just because FrameMaker is kind of designed to resist format overrides. That said, a copy/paste will normally maintain format overrides. I just wouldn't bet my piano on it.

If your paste action is doing nothing, I would suggest that you are doing something wrong. Either:

- There is nothing on the clipboard, because you did not properly select some text before executing Copy()

- Your insertion point for the paste is an invalid location.

The way you test this is to combine it with manual actions. For example, run just the Copy() operation alone, then manually try to paste the content somewhere. If it doesn't work, you know the problem is with the copy attempt. If it does, then try the same thing with the paste. Manually copy something, then run a script that goes straight to the paste. The clipboard is all the same... so you just have to iteratively step through the actions while troubleshooting. Lots of ES troubleshooting follows this methodology, since there is really no way to "see" what is happening inside the app.

Russ

Community Expert, Mar 12, 2015

Thanks, Russ, for this advice. The first part (filling the clipboard) was OK; I checked the contents of the clipboard with a clipboard-inspector utility. Nevertheless, I tested again:

var oDoc = app.ActiveDoc;
var pgf  = oDoc.MainFlowInDoc.FirstTextFrameInFlow.FirstPgf;
var lastPfg = oDoc.MainFlowInDoc.FirstTextFrameInFlow.LastPgf;

  var tr = new TextRange();                       //get text selection for paragraph            
  tr.beg.obj = tr.end.obj = pgf;
  tr.beg.offset = 0;
  tr.end.offset = Constants.FV_OBJ_END_OFFSET;   

  oDoc.TextSelection = tr;
  oDoc.Copy();                                    // Clipboard can be pasted manually

// Add a new paragraph after the current paragraph. 
  var newPgf = oDoc.NewSeriesPgf (lastPfg);       // OK
  var textLoc = new TextLoc (newPgf, 0);          // cursor nowhere 

Run script
=> new para at end
Put cursor therein and paste => OK
... OK all the time

However, the second part (paste) is quirky - don't know where the problem really is. Since on my system the ESTK has no connection to FM (since FM-10) I tested on the system of my wife - but the same effects there:

var oDoc = app.ActiveDoc;
var lastPfg = oDoc.MainFlowInDoc.FirstTextFrameInFlow.LastPgf;

// Add a new paragraph after the current paragraph. 
  var newPgf = oDoc.NewSeriesPgf (lastPfg);       // OK
  var textLoc = new TextLoc (newPgf, 0); 
 
oDoc.Paste ();                         // random results

Copy the first paragraph manually
Run script
=> new para at end, nothing pasted into it
Copy again manually
Run script
=> new para at end, nothing pasted into it
Copy again manually
Run script
=> new para at end, nothing pasted into it
Run script (with old clipboard contents)
=> new para at end, pasted as second para
Run script (with old clipboard contents)
=> new para at end, pasted as third para

Is this black or white magic?

Community Expert, Mar 12, 2015 (marked as correct answer)

Klaus, OK, before you paste, you need to set the TextSelection to your insertion point.

// Add a new paragraph after the current paragraph. 

var newPgf = oDoc.NewSeriesPgf (lastPgf);

var textRange = new TextRange (new TextLoc (newPgf, 0), new TextLoc (newPgf, 0));


oDoc.TextSelection = textRange;

oDoc.Paste ();

-Rick

Mentor, Mar 12, 2015

I'll also mention that text locations/ranges are a little weird with the last paragraph in the flow. Something to do with the end-of-flow mark, or something else, I don't know. Any time I build a document paragraph by paragraph, I always leave that last paragraph as empty, then insert next-to-last paragraphs and build within those. Then at the end, just delete the last empty paragraph. I find it much cleaner.

Russ

Community Expert, Mar 13, 2015

Thank You both! This part now works.

// Copy first paragraph to the end of flow
var oDoc = app.ActiveDoc;
var pgf  = oDoc.MainFlowInDoc.FirstTextFrameInFlow.FirstPgf;
var lastPfg = oDoc.MainFlowInDoc.FirstTextFrameInFlow.LastPgf;

  var tr = new TextRange();                       //get text selection for paragraph            
  tr.beg.obj = tr.end.obj = pgf;
  tr.beg.offset = 0;
  tr.end.offset = Constants.FV_OBJ_END_OFFSET;   

  oDoc.TextSelection = tr;                        // get the paragraph
  oDoc.Copy();                                    // Clipboard contains formatted stuff

  var newPgf = oDoc.NewSeriesPgf (lastPfg);       // Add a new paragraph at end of flow
  var textLoc = new TextLoc (newPgf, 0); 
  var textRange = new TextRange (textLoc, textLoc);
  oDoc.TextSelection = textRange;                 // set TextSelection to insertion point

  oDoc.Paste ();                                  // "replace" TextSelection

Steep learning curve!

Mind boggling concept (or limited brains)

One little step - but for walking every step has equal importance.
