9 Replies Latest reply on Aug 13, 2013 11:06 AM by Vamitul

    Big performance issue while removing tabs by indents

    Loic.Aigon Adobe Community Professional

      Hi guys,

       

      I am having an issue with an algorithm of mine. Although it works as expected, it can take up to 6 hours to execute on certain files.

       

      Here is the thing. I need to identify areas where there are "dotted blanks".

      Those dots can be constructed in myriad ways, but that is not the debate here. Among all the possibilities, one is a succession of tabs, as in the following snapshot. Some tabs interest me, others don't. To ease my identification process, I want to replace the useless tabs with indents and keep only the useful ones. By useless tabs I mean the ones that are not "dotted" and sit either at the beginning or at the end of the sentence.

       

      Example:

       

      http://s9.postimg.org/djjgr920f/Capture_d_e_cran_2013_08_11_a_12_52_15.png

       

      What I need to get

       

      http://s15.postimg.org/b52h5bcvv/result.png

       

      So in my case :

       

      TAB TAB(dotted) TAB RETURN

       

      The first tab and the last one are of no interest to me, so I replace them with indents:

       

      INDENT TAB(dotted) INDENT RETURN

       

      As I said, this process works as expected, but it can take several hours to execute on certain files. So what I would like to know is:

       

      • Am I doing things wrong (bad algorithm construction)?
      • Is this time incompressible?
      • Do you see another approach?

       

      Here is the problematic method :

       

       

      replaceFirstAndEndTabByIndent:function(doc)
                          {
      
      // IDUtils.findChangeGrep(doc,{findWhat:"^\\t+"},undefined,false) => returns the found runs of leading tabs
      
                                    var firstTabs = IDUtils.findChangeGrep(doc,{findWhat:"^\\t+"},undefined,false);
                                    var tabs, tab, exit = false, leftShift = 0, rightShift = 0, p, ptf,
                                    paras = [], tabsToRemove = [], rem, tss, ts, xMax;
      
                                          //looping through beginning tabs
                                          //if not dotted, increment indentation value then finally apply to paragraph
                                    while ( tabs = firstTabs.pop() )
                                    {
                                              leftShift = rightShift = 0;
      
                                              p = tabs.paragraphs[0];
                                             
                                                      // if paragraph has bullet or may be looking so, we ignore it
                                              if ( AppLib.paragraphContainsBullet ( p ) ) continue;
      
      
                                              ptf = p.parentTextFrames[0];
      
                                              tabs = tabs.characters.everyItem().getElements();
      
                                              while ( tab = tabs.shift() )
                                              {
                                                        if ( !this.analyzeItem(tab) )
                                                        {
                                                                  leftShift+=Math.abs( tab.endHorizontalOffset-tab.horizontalOffset);
      
                                                                  tabsToRemove.push ( tab );
      
                                                                  if ( tabs.length ==0 ) paras.push ( {p:p, leftIndent:leftShift, ptf:ptf} );
                                                        }
                                                        else
                                                        {
                                                                  tabs.unshift ( tab );
                                                                  paras.push ( {p:p, leftIndent:leftShift, ptf:ptf} );
                                                                  break;
                                                        }
                                              }
      
                                                       //replace all final non dotted tabs by a matching indentation
                                              while ( tab = tabs.pop() )
                                              {
                                                        if ( !this.analyzeItem(tab) )
                                                        {
                                                                  rightShift+=Math.abs( tab.endHorizontalOffset-tab.horizontalOffset);
      
                                                                  tabsToRemove.push ( tab );
      
                                                                  if ( tabs.length ==0 ) paras.push ( {p:p, rightIndent:rightShift, ptf:ptf} );
                                                        }
                                                        else
                                                        {
                                                                  var maxSpace = Math.abs ( p.parentTextFrames[0].visibleBounds[3]-tab.endHorizontalOffset );
                                                                  rightShift = rightShift > maxSpace ? maxSpace : rightShift;
                                                                  paras.push ( {p:p, rightIndent:rightShift, ptf:ptf} );
                                                                  break;
                                                        }
                                              }
                                    }
      
                                    tabsToRemove  = tabsToRemove.reverse();
      
                                      //we now apply indentations as computed
                                    while ( p =  paras.pop() )
                                    {
                                              try
                                              {
                                                        if ( !p.p.isValid ) continue;
                                                        if ( p.leftIndent ) p.p.leftIndent += p.leftIndent;
      
                                                        if ( p.rightIndent )
                                                        {
                                                                  p.p.rightIndent += p.rightIndent;
      
                                                        }
                                                        //*
                                                        tss = p.p.tabStops.everyItem().getElements();
                                                        if ( p.p.parent instanceof Cell )
                                                        {
      
                                                        }
                                                        else
                                                        {
                                                                  xMax = Math.abs ( p.ptf.visibleBounds[3] - p.ptf.visibleBounds[1] ) -p.p.rightIndent;
      
                                                                  while ( ts = tss.pop() )
                                                                  {
                                                                            if ( ts.position > xMax )
                                                                            {
                                                                                      ts.position = xMax;
                                                                            }
                                                                            else
                                                                            {
                                                                                      break;
                                                                            }
                                                                  }
                                                        }
                                                        //*/
                                              }
                                              catch(e)
                                              {
                                              }
                                    }
                                      //Now removing undesired tabs
                                    while ( rem = tabsToRemove.pop() )
                                    {
                                              rem.remove();
                                    }
                          },
      
      

       

      It runs in a reasonable time on certain files (10 to 15 seconds at most), but for some I recorded the time and it took almost 6 hours :\

       

      Any ideas will be welcome.

       

      TIA

       

      Best,

       

      Loic

        • 1. Re: Big performance issue while removing tabs by indents
          UQg Level 4

          Hello,

          I know 0 (zero) about InDesign, so this is more a comment than a solution.

           

          Array.prototype methods are 'not so fast', especially shift, unshift and splice, so it might be useful to split the work into smaller chunks and deal with smaller arrays.

           

          Maybe add an optional parameter to your IDUtils.replaceFirstAndEndTabByIndent and IDUtils.findChangeGrep functions, something like {from: ..., blocksize: ...}, then choose a blocksize and loop over 'from' to cover the whole doc, instead of treating the whole doc at once.

          As I said, I don't know InDesign, so I have no idea what the blocksize could be (10 pages, 50 paragraphs, 100 lines ???).
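A minimal sketch of the {from, blocksize} loop, written against a plain array rather than InDesign objects. `processInBlocks` and its `handler` callback are hypothetical names used only for illustration; in the real script each block would hold found tab ranges instead of numbers.

```javascript
// Hypothetical helper: walks `items` in consecutive blocks of `blockSize`
// elements and hands each block to `handler(block, from)`.
function processInBlocks(items, blockSize, handler) {
    var from, block;

    // Guard against a zero or negative size, which would never advance
    if (!(blockSize > 0)) {
        blockSize = 1;
    }

    for (from = 0; from < items.length; from += blockSize) {
        // slice() copies the block and leaves `items` untouched
        block = items.slice(from, from + blockSize);
        handler(block, from);
    }
}

// Usage: plain numbers stand in for the found tab ranges
var seen = [];
processInBlocks([1, 2, 3, 4, 5], 2, function (block, from) {
    seen[seen.length] = from + ":" + block.join(",");
});
// seen is now ["0:1,2", "2:3,4", "4:5"]
```

Each block is finished before the next one starts, so nothing larger than `blockSize` elements is held by the handler at any one time.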

           

          Also, you apparently don't need to store all these tabs, so a function that does things on the fly would probably be faster than a pair {one that stores / one that does something with the stored values}.

           

          Just out of curiosity, could you give the length of the initial array firstTabs for a doc that requires 6 hours of computing time?

          Sorry if this answer misses the mark.

          Xavier.

          • 2. Re: Big performance issue while removing tabs by indents
            Loic.Aigon Adobe Community Professional

            Hi UQg,

             

            First of all, thanks for the interest, whatever your InDesign knowledge is.

             

            Second of all, to answer your curiosity: I get 122 tabs in my firstTabs array.

             

            Last of all, your proposal may be worth a try. I do indeed run a grep search across the whole document. Maybe I should run this search text frame by text frame. I will try this and hopefully get better results.

             

            Thanks again,

             

            Best,

             

            Loic

            • 3. Re: Big performance issue while removing tabs by indents
              Vamitul Level 4

              Hi Loic!

               

              Looking through your code, there don't seem to be any major performance traps. The array methods are slow, but not slow enough to justify such a huge drop in performance.

               

              Two things I can think of:

              1) Somewhere in your script (maybe in AppLib?) you are doing some serious DOM access. The collection methods nextItem() and previousItem() are incredibly slow, but all DOM methods take a performance hit that increases exponentially with the size of the document.

               

              2) See if you can remove the try/catch. Just having it in your code is a major performance cost.

               

              Also, see if UndoModes.FAST_ENTIRE_SCRIPT works for you (warning, warning, warning: major cause of bugs).

              • 4. Re: Big performance issue while removing tabs by indents
                Trevorׅ Adobe Community Professional

                Hi Loic,

                 

                Here are a few tips that could help out quite a bit, though I'm sure you know them yourself.

                 

                Besides changing the lines like paras.push to paras[z++] =

                and

                p =  paras.pop()

                to

                l = paras.length;
                while (l--)
                {
                     p = paras[l];
                      ......
                }
                paras = null;
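Spelled out on a plain array, the two substitutions might look like the sketch below. The `{ id: i }` objects are placeholders for the real paragraph records, and `visited` exists only to show the order of traversal.

```javascript
// Collect with an explicit index instead of push()
var paras = [], z = 0, i;
for (i = 0; i < 5; i++) {
    paras[z++] = { id: i };
}

// Walk backwards with a counter instead of pop():
// the array itself is never modified during the loop
var visited = [], l = paras.length, p;
while (l--) {
    p = paras[l];
    visited[visited.length] = p.id;
}

paras = null; // drop the references once the work is done
// visited is now [4, 3, 2, 1, 0]
```

The traversal order is the same as with repeated pop() calls (last element first), so the loop body should not need to change.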

                 

                I have found that on large arrays adding slice(0) ......everyItem().getElements().slice(0); can make a big difference.

                 

                See also Ariel's tip here about splitting into smaller parts

                 

                Depending on the version of InDesign that you are working with, however, you can try the FAST_ENTIRE_SCRIPT undo mode, as long as you set your grep outside this FAST_ENTIRE... mode.

                 

                Setting app.scriptPreferences.enableRedraw = false can make a big difference.
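The undo-mode and redraw tips can be combined in one small wrapper. This is only a sketch under assumptions: `runFast` is a hypothetical name, and the InDesign calls it relies on are `app.doScript`, `ScriptLanguage.JAVASCRIPT`, `UndoModes.FAST_ENTIRE_SCRIPT` and `app.scriptPreferences.enableRedraw`. Outside InDesign it simply calls the task, so the logic can be tried in any JavaScript engine.

```javascript
// Hypothetical wrapper: runs `task` with screen redraw disabled and,
// when the InDesign objects are available, as one fast undo step.
function runFast(task, undoName) {
    var inInDesign = (typeof app !== "undefined") &&
                     (typeof UndoModes !== "undefined");
    var previousRedraw;

    if (!inInDesign) {
        // Not running inside InDesign: just execute the task
        task();
        return;
    }

    previousRedraw = app.scriptPreferences.enableRedraw;
    app.scriptPreferences.enableRedraw = false;
    try {
        // FAST_ENTIRE_SCRIPT records the whole task as a single undo step
        app.doScript(task, ScriptLanguage.JAVASCRIPT, [],
                     UndoModes.FAST_ENTIRE_SCRIPT, undoName);
    } finally {
        // Restore the user's redraw setting even if the task throws
        app.scriptPreferences.enableRedraw = previousRedraw;
    }
}

// Usage: the function body stands in for the tab-replacement work
var ran = false;
runFast(function () { ran = true; }, "Replace tabs by indents");
```

Following the remark above about setting the grep outside the fast mode, the find/change grep preferences would be prepared before calling `runFast`, not inside the task. The single try/finally sits outside any loop, so it is not the per-item try/catch discussed earlier in the thread.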

                 

                How many pages is your 6+ hour document? And does it have a lot of diacritics?

                If it has a lot of diacritics, then set the findGrep to ignore them.

                Let us know if this helps.

                 

                Regards,

                 

                Trevor

                 

                P.S. If you are on Windows, look at the Resource Monitor to see whether InDesign is waiting for some other process.

                • 5. Re: Big performance issue while removing tabs by indents
                  Loic.Aigon Adobe Community Professional

                  Hi Vamitul, Trevor,

                   

                  Thanks a lot for your inputs. I will try them and let you know

                   

                  Loic

                  • 6. Re: Big performance issue while removing tabs by indents
                    Loic.Aigon Adobe Community Professional

                    Hi Vamitul, Trevor, UQg,

                     

                    I wanted to thank you all. I wish I could mark each of you as the correct answer. Each of your tips has helped me reduce my process from 6 hours to only 90 seconds. It seems that the major issue was dealing with such a large array (almost 500 elements). In fact, the processing time per item gradually increased from a few seconds for the first items to more than a minute around the 200th element, and kept growing…

                     

                    To implement UQg's tip, I tweaked a function I found on the internet to chunk my array into an array of ten-element arrays, and then process every item chunk by chunk.

                     

                     

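                    // Splits array a into chunks of s elements.
                    // Note: splice() empties the input array a as it goes.
                    chunkArray:function(a,s)
                                        {
                                                  var chunks = [];
                                                  var n = a.length;
                    
                                                  // Fewer than s elements: return them as a single chunk
                                                  if ( a.length < s ) return [a];
                    
                                                  while (n--)
                                                  {
                                                            // Move the first s elements of a into a new chunk
                                                            chunks[chunks.length] = a.splice(0, s);
                                                            // Refresh the counter so the loop stops once a is empty
                                                            n = a.length;
                                                  }
                    
                                                  return chunks;
                                        },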
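Because chunkArray relies on splice(), the array passed in is empty once it returns. Where the original array is still needed afterwards, a copy-based variant is one option. This is a sketch only: `chunkCopy` is a hypothetical name, and unlike chunkArray it returns an empty list (not a list holding one empty chunk) for an empty input.

```javascript
// Hypothetical non-destructive variant: builds the chunks with slice(),
// so the source array keeps all of its elements.
function chunkCopy(a, s) {
    var chunks = [], i;

    // Guard against a zero or negative size, which would never advance
    if (!(s > 0)) {
        s = 1;
    }

    for (i = 0; i < a.length; i += s) {
        chunks[chunks.length] = a.slice(i, i + s);
    }
    return chunks;
}

// Usage: seven placeholder items split into chunks of three
var source = [1, 2, 3, 4, 5, 6, 7];
var chunks = chunkCopy(source, 3);
// chunks is [[1, 2, 3], [4, 5, 6], [7]] and source still has 7 items
```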
                    
                    

                     

                    All of this put together seems to boost performance tremendously. Thank you so much to all of you.

                     

                    Loic

                    • 7. Re: Big performance issue while removing tabs by indents
                      UQg Level 4

                      good news ; )

                      6 hours to 90 sec is quite impressive!

                       

                      Just for the record, here is an article I read a while ago on script optimization, which I try to keep in mind all the time.

                      It runs various speed tests using this or that technique; part 4 (usage of buffers) is particularly relevant to this topic.

                      http://www.webreference.com/programming/javascript/jkm3/index.html

                       

                      Xavier

                      • 8. Re: Big performance issue while removing tabs by indents
                        Trevorׅ Adobe Community Professional

                        Nice article Xavier, Thanks for sharing it.

                        • 9. Re: Big performance issue while removing tabs by indents
                          Vamitul Level 4

                          Xavier, thanks, been looking for that link for a while now (read the article a while back, and lost the link).