Hi all.
I have the following task: I need to translate a document into another language using ExtendScript. As input, I have a document with text, graphics, tables, etc. in Language_1 and a delimited file (with some separator) that contains the translations into Language_2. E.g.:
Some_text_in_language_1 Some_text_in_language_2
Some_other_text_in_language_1 Some_other_text_in_language_2
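For illustration, here is a minimal sketch of reading such a file into source/target pairs. It assumes the separator is a tab and that each line holds exactly one pair; the post does not specify either, and parseTranslationPairs is a hypothetical helper name, not part of any FrameMaker API.

```javascript
// Sketch only: assumes one "source<TAB>target" pair per line.
// parseTranslationPairs is a hypothetical helper, not a FrameMaker API.
function parseTranslationPairs(fileContents) {
    var pairs = [];
    var lines = fileContents.split(/\r\n|\r|\n/);
    for (var i = 0; i < lines.length; i += 1) {
        var line = lines[i];
        var tabPos = line.indexOf("\t");
        if (tabPos < 1) {
            continue; // skip blank lines and lines without a source text
        }
        pairs.push({
            source: line.substring(0, tabPos),
            target: line.substring(tabPos + 1)
        });
    }
    return pairs;
}
```

The result is an array of { source, target } objects in file order, which can later be looked up against the text extracted from the document.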
To get the source text from the document, I've tried to use this:
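var doc = app.ActiveDoc; // assumes the document to translate is the active one
var pgf = doc.MainFlowInDoc.FirstTextFrameInFlow.FirstPgf;
while (pgf.ObjectValid()) {
    // Ask only for the string items of this paragraph
    var test = pgf.GetText(Constants.FTI_String);
    var text = "";
    for (var i = 0; i < test.len; i += 1) {
        // Index the individual text item, then trim surrounding whitespace
        var str = test[i].sdata.replace(/^\s+|\s+$/g, '');
        text = text + str;
        PrintTextItem(test[i]); // helper defined elsewhere in the script
    }
    pgf = pgf.NextPgfInFlow;
}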
But with this, I can only access the regular text in the document (e.g. the text in tables remains untouched). Is there any way I can get all the textual data from a specified document? Or maybe a full list of the objects that can contain text, so that I can iterate through them and extract it piece by piece? Or maybe there's a better way to solve this problem?
Thanks in advance! Any advice would be greatly appreciated.
Hi,
GetText delivers an array of text items.
Text items could be text but also table anchors, markers etc.
You'll never get text of a table if you call pgf.GetText(Constants.FTI_String).
If you want to have text and table, you have to call
var textItems = pgf.GetText(Constants.FTI_String | Constants.FTI_TblAnchor);
After that, you have to loop through the text items and check for table anchors. Then you can get the text from that table, i.e. from its table cells.
If you want to have all kinds of text item types, you can call
var textItems = pgf.GetText(-1);
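A minimal sketch of that loop might look like the following. It assumes the usual FrameMaker object-model names (dataType, sdata and obj on a text item; FirstRowInTbl, FirstCellInRow, NextCellInRow, NextRowInTbl and FirstPgf for walking a table), so please check them against the scripting guide for your version. getPgfStrings and getTableStrings are hypothetical helper names.

```javascript
// Sketch, assuming the property names below match your FrameMaker version.
// getPgfStrings and getTableStrings are hypothetical helper names.
function getPgfStrings(pgf, result) {
    var items = pgf.GetText(Constants.FTI_String | Constants.FTI_TblAnchor);
    for (var i = 0; i < items.len; i += 1) {
        if (items[i].dataType === Constants.FTI_String) {
            result.push(items[i].sdata);
        } else if (items[i].dataType === Constants.FTI_TblAnchor) {
            getTableStrings(items[i].obj, result); // obj is the anchored table
        }
    }
    return result;
}

function getTableStrings(tbl, result) {
    var row = tbl.FirstRowInTbl;
    while (row.ObjectValid()) {
        var cell = row.FirstCellInRow;
        while (cell.ObjectValid()) {
            var pgf = cell.FirstPgf;
            while (pgf.ObjectValid()) {
                getPgfStrings(pgf, result); // recursion also covers nested tables
                pgf = pgf.NextPgfInFlow;
            }
            cell = cell.NextCellInRow;
        }
        row = row.NextRowInTbl;
    }
    return result;
}
```

Calling getPgfStrings(pgf, []) for each paragraph of the flow should then return the paragraph's own strings with the cell text of any anchored table in between.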
Hope this helps
Markus
Thanks a lot for the answer, Markus. Indeed, the "-1" seems like my salvation.
Just to clarify one thing: with this construction, I get all the textual data, both directly as strings and through anchors. In my test document, I've noticed only table anchors. Are there any other elements that can contain text and will be returned by this construction as anchors?
There is another way to loop through ALL paragraphs in a document, regardless of whether they are in a table or in the main text flow. You can use the FirstPgfInDoc property of the document and loop through all Pgf objects using the NextPgfInDoc property of the Pgf until you reach an invalid object. Note that this also includes all paragraphs on the master and reference pages, so it might be useful to check where the Pgf is located (on a body page or not). There is a script on this forum that does that - I believe it was created and posted by Rick Quatro.
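As a rough sketch, such a loop could look like this. forEachPgfInDoc is a hypothetical helper name, and the body-page check is deliberately left to a caller-supplied keepPgf function, since this is not the forum script mentioned above. As far as I know, the order in which NextPgfInDoc returns paragraphs is not necessarily the reading order.

```javascript
// Sketch: visit every paragraph in the document, not only the main flow.
// forEachPgfInDoc and the keepPgf filter are hypothetical names.
function forEachPgfInDoc(doc, keepPgf, visit) {
    var count = 0;
    var pgf = doc.FirstPgfInDoc;
    while (pgf.ObjectValid()) {
        // keepPgf decides whether the paragraph is of interest,
        // e.g. "is on a body page"; that check is left to the caller.
        if (keepPgf(pgf)) {
            visit(pgf);
            count += 1;
        }
        pgf = pgf.NextPgfInDoc;
    }
    return count; // number of paragraphs that were visited
}
```

In FrameMaker this would be called with something like forEachPgfInDoc(app.ActiveDoc, myBodyPageCheck, myHandler), where both callbacks are your own functions.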
Working your way through the main text flow does not guarantee that you have all the visible text in the doc. There may be multiple flows and there may also be text frames that are placed inside anchored frames. Those text frames are not contained directly in the main flow of the document.
Good luck with your scripting
Jang
Thanks for the response, Jang.
"Working your way through the main text flow does not guarantee that you have all the visible text in the doc. There may be multiple flows and there may also be text frames that are placed inside anchored frames. Those text frames are not contained directly in the main flow of the document."
Wow, that's frustrating. I feel like I'm trying to dig the ground with a spoon. Well, that was the reason why I posted my task. Maybe you could give some advice on an alternative way to achieve this goal?
"Just to clarify one thing: with this construction, I get all the textual data, both directly as strings and through anchors. In my test document, I've noticed only table anchors. Are there any other elements that can contain text and will be returned by this construction as anchors?"
Markers, cross-references, variables, footnotes, hypertext, equations, text insets, and callouts placed on graphics, such as text frames or text lines.
Some hints:
You get the table title from the table object via its "FirstPgf" property.
Markers have a "MarkerText" property.
For xrefs, you have to get the xref format and its definition.
For variables, you have to get the variable format and its definition.
For equations, you have to get the MathFullForm property.
And so on.
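A small sketch of how some of these hints could be applied to a single text item. It assumes the item types FTI_MarkerAnchor, FTI_XRefBegin and FTI_VarBegin, and that the format definition is exposed as the Fmt property of the XRefFmt and VarFmt objects; please verify these names against the scripting guide. getAnchoredText is a hypothetical helper name.

```javascript
// Sketch, assuming these constants and property names exist as written.
// getAnchoredText is a hypothetical helper name.
function getAnchoredText(item) {
    switch (item.dataType) {
    case Constants.FTI_MarkerAnchor:
        return item.obj.MarkerText;  // text stored in the marker
    case Constants.FTI_XRefBegin:
        return item.obj.XRefFmt.Fmt; // definition of the xref format
    case Constants.FTI_VarBegin:
        return item.obj.VarFmt.Fmt;  // definition of the variable format
    default:
        return null;                 // not handled in this sketch
    }
}
```

Keep in mind that xref and variable formats are shared definitions, so translating a definition affects every instance that uses that format.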
BTW: you can use Save as XML (in an unstructured document, too).
Then you will have the content of your text flow in the XML file and can process it with a very simple XSLT stylesheet.
Be aware: as far as I can see, not all objects (markers and so on) are exported to XML in the standard way.
Markus
Okay, thanks again to all for your comments.
I realized that the problem was in my approach, and I ended up scripting the file translation based on the "Find/Replace" function (I'm noting this for anyone who faces the same problem).
Good luck!