3 Replies Latest reply on Nov 23, 2017 11:50 PM by Oriup

sort found digits

Hello All,

I have a question, how to sort all myFound object?

```var myDoc = app.documents[0];
app.findGrepPreferences = null;
app.findGrepPreferences.findWhat = "\\d+";
var myFound = myDoc.findGrep();

myFound.sort(function(a, b) {
return a - b;
});

for(var i=0; i<myFound.length; i++){
}
```

Above code is not working well, please guide me.

Thanks

Sumit

Message was edited by: Sumit Kumar

• 1. Re: sort found digits

It does work, but it does not sort on what you think it's sorting. The return result of findGrep is an array of Text: https://www.indesignjs.de/extendscriptAPI/indesign-latest/#Text.html#d1e450430__d1e457955 and so you sort based on the equation ([object Text] - [object Text]). (Which always returns "NaN", because apparently you cannot subtract one Text object from another.)

If you want to sort on the text contents, do exactly that.

```myFound.sort(function (a, b) {
return a.contents.localeCompare(b.contents);
});
```

which is a text comparison, or

```myFound.sort(function (a, b) {
return Number(a.contents) - Number(b.contents);
});
```

which is a value comparison.

• 2. Re: sort found digits

Hi all,

It's worth noting that invoking Text.contents from within a sort function is critically time consuming.

The reason is the “5×N×Log(N) rule” (see Indiscripts :: Reconsidering Array.sort( ) in ExtendScript/JavaScript — Part 1 ) which tells us that, on average, if you have say 1,000 objects in the Array you want to sort, the compare function will be invoked ~15,000 times. So, as each call involves two DOM commands (a.contents and b.contents), this hits the DOM ~30,000 times while only 1,000 objects are actually compared.

I did some tests based on a 5-page documents containing 4,000 sparse digits at random locations. My setup looks like this:

Then my testing protocol has the general form:

```doc = app.properties.activeDocument;
app.findGrepPreferences = null;
app.findGrepPreferences.findWhat = "\\d+";
founds = doc.findGrep(); // `founds.length` is about 4000

\$.hiresTimer;
//-----------
<textSort>(founds);      // Here I use various <textSort> algorithms.
//-----------

```

where <textSort> refers to the sort routine to be used.

To keep things absolutely uniform regarding text-to-number conversion, I used a generic VALUE function which returns the numeric value from any captured Text. It is based on parseInt(…) (since the parsed strings are supposed to represent integers), but parseFloat(…) or Number(…) could do the job as well.

```const VALUE = function(/*Text*/tx)
{
return parseInt(tx.contents,10)
};

```

1. The naive approach (our benchmark) is as follows:

```const textSort1 = function(/*Text[]&*/A,  f)
//----------------------------------
// 1. Original method.
{
f = callee.sorter;
A.sort( f );
};

textSort1.sorter = function(/*Text*/x,/*Text*/y)
{ return VALUE(x) - VALUE(y) };

```

It takes the Text array and applies the original sort method, which simply returns VALUE(x)-VALUE(y) for each x,y items. Keep in mind that any call to VALUE implies the Text.contents command. So we get here catastrophic results (of the order of ten seconds on my platform.)

2. How to improve textSort1? The problem with Text instances is, they have no ID, so you cannot easily separate the DOM-access side from the calculation side. However, a well-knows fact is that toSpecifier() is faster than any other method. So it is relevant to access Text.contents only N times and to store the result in a map based on specifiers. Thus, the sort routine only invokes toSpecifier() and use the proxy map for comparing data. Here is the snippet:

```const textSort2 = function(/*Text[]&*/A,  f,q,n,i,t)
//----------------------------------
// 2. Slight optimization thru a map. ABOUT 3X FASTER
// [REM] `toSpecifier()` is faster than `contents`.
{
f = callee.sorter;
q = f.Q = {};
for( n=A.length, i=-1 ; ++i < n ; )
{
t = A[i];
q[t.toSpecifier()] = VALUE(t);
}
A.sort( f );
};

textSort2.sorter = function(/*Text*/x,/*Text*/y)
{ return callee.Q[x.toSpecifier()] - callee.Q[y.toSpecifier()] };

```

According to my tests, this method is ~3X faster than the original. Note that VALUE is called only from the linear loop (textSort2). The compare function only wants to access Text.toSpecifier(), but it stills does it 5×N×Log(N) times. The DOM is still very much in demand, although the algorithm runs significantly faster.

3. Finally, the solution that sounds to me the best (and the most impressive!) is based, as usual, on entirely removing DOM heat from the compare function. Even better, it just uses the native Array.sort() with no callback—which has been shown to be extremely efficient in optimization issues. The trick is based on a simple idea: find a way to store everything (computed numbers and indices) in terms of orderable UTF16 strings. Save those data in a proxy array and sort it. Then, recover the original indices and apply a merge sort on the original Text array.

The result now comes out within 250 ms (10X faster than textSort2, 30X faster than the naive method.)

I know, the code requires much more attention, but I think it's a fair illustration of my doctrine, the easiest path is not always the best.

```const textSort3 = function(/*Text[]&*/A,  q,n,i,v,t,j,k)
//----------------------------------
// 3. Using a proxy array (of strings.) ABOUT 30X FASTER
{
const CHR = String.fromCharCode;
q = Array(n=A.length);

// Proxy array.
// ---
for( i=-1 ; ++i < n ; )
{
v = ''+VALUE(A[i]);
q[i] = CHR(1+v.length) + v + CHR(1+i);
}

q.sort(); // default sort (very fast!)

// Reorder A according to q. (In-place merge sort.)
// ---
while( i-- )
{
if( 'string' != typeof(k=q[i]) ) continue;
t = A[i];

for( j=i ; 1 ; (A[j]=A[k]),(j=k),(k=q[j]) )
{
'string' == typeof k && (k=-1+k.charCodeAt(-1+k.length));
q[j] = j;
if( k == i ) break;
}

A[j] = t;
}
};
```

@+

Marc

• 3. Re: sort found digits

Hi Marc,

You are absolutely right, time is so important.

But I could not understand your code how to use it, in my case 14000 object to sort.

Could you elaborate it, where to put my myFound variable to get sorter variable?

Jongware code is easily understandable for me.

Thanks

Sumit