38 Replies Latest reply on Oct 16, 2013 2:42 AM by Marc Autret

chronological order number needs in index numbers to be changed as ndash

Hi all

I need to meet the following requirement for the index of a book. Please help me.

I have sequences of numbers like this:

Index1, 26, 35, 36, 37, 47
Index2, 65, 78, 79, 89, 90

I need to change them like this:

Index1, 26, 35–37, 47
Index2, 65, 78–79, 89–90

i.e., numbers that run in consecutive order need to be joined with an en dash (ndash).

Sajeev

• 1. Re: chronological order number needs in index numbers to be changed as ndash

Thanks for the fun morning algorithm exercise...

I think this will do it for you.

Regards

Bob

var indexize = function( ar ) {
    var out = new Array();
    for ( var i = 0; i < ar.length; i++ ) {
        var a = i + 1;
        var current = ar[ i ];
        var concat = false;
        // Walk forward while each next number continues the run
        while ( parseInt( ar[ a++ ] ) == ( current + 1 ) ) {
            current = ar[ a - 1 ];
            concat = true;
        }
        if ( concat ) {
            if ( parseInt( ar[ i ] ) + 1 == current ) {
                // A run of only two numbers stays as two separate entries
                out.push( ar[ i ] );
                out.push( ar[ i + 1 ] );
            } else {
                out.push( ar[ i ] + "-" + current );
            }
        } else {
            out.push( ar[ i ] );
        }
        // Resume after the run (the loop's i++ adds the final step)
        i = a - 2;
    }
    return out;
}

var a = [ 1, 2, 4, 5, 7, 9, 11, 12, 15, 16, 17, 18, 19, 20, 22, 23, 25, 26, 27, 30, 31, 33, 34, 35, 36, 40, 41, 43,44,45,47,48,50 ];
$.writeln( "Input: " + a );
debugger;
$.writeln( "Output: " + indexize( a ) );

• 2. Re: chronological order number needs in index numbers to be changed as ndash

Oh, and do you really want 65,78,89,90 to contain 89-90?

It doesn't make sense (to me) to put a hyphen between a sequence of only two numbers.

But if you do, make the "if (concat)" block look something like:

out.push( concat ? ( ar[ i ] + "-" + current ) : ar[ i ] );
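
For what it's worth, both behaviours can live behind one switch. A minimal sketch, assuming the index entry arrives as a comma-separated string like Sajeev's example; `collapseIndexLine`, `pageNumber` and the `minRun` parameter are made-up names for illustration, not part of the script above:

```javascript
// Hypothetical helper: returns the page number, or NaN for anything
// that is not purely digits (e.g. the "Index1" heading).
function pageNumber(s) {
    return /^\d+$/.test(s) ? parseInt(s, 10) : NaN;
}

// Collapses runs of consecutive page numbers in one index line.
// minRun = shortest run that gets a dash (2 joins pairs, 3 leaves them alone).
function collapseIndexLine(line, dash, minRun) {
    dash = dash || "\u2013";                          // en dash by default
    minRun = (minRun && minRun > 2) ? minRun : 2;     // never join a single number
    var parts = line.split(/,\s*/);
    var out = [];
    var i = 0, j, k, first;
    while (i < parts.length) {
        first = pageNumber(parts[i]);
        j = i;
        // Extend j while the following entries continue the run.
        // A NaN heading never matches, so it is always emitted on its own.
        while (j + 1 < parts.length &&
               pageNumber(parts[j + 1]) === first + (j + 1 - i)) {
            j++;
        }
        if (j - i + 1 >= minRun) {
            out.push(parts[i] + dash + parts[j]);
        } else {
            // Run too short: emit its members individually
            for (k = i; k <= j; k++) out.push(parts[k]);
        }
        i = j + 1;
    }
    return out.join(", ");
}
```

With `minRun` left at 2, "Index2, 65, 78, 79, 89, 90" comes out as "Index2, 65, 78–79, 89–90"; passing 3 leaves pairs such as 89, 90 alone.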

Bob

• 3. Re: chronological order number needs in index numbers to be changed as ndash

Peter Kahrel has had his part of the fun years ago -- http://www.kahrel.plus.com/indesign/index_update.html

• 4. Re: chronological order number needs in index numbers to be changed as ndash

Thanks, Jongware and Bob

Really nice script, works nicely and thanks for your kind help

Sajeev

• 5. Re: chronological order number needs in index numbers to be changed as ndash

Hi all,

Here is the routine I use in IndexMatic 2 (script-in-progress). It's certainly equivalent to Bob's and/or Peter's approach, but why not share it:

```
function makeSequences(/*Number[]*/numbers, /*String*/separator, /*String*/linker)
{
separator = separator || ", ";
linker = linker || "-";

if( numbers.length < 2 ) return numbers.join('');

var a = numbers.concat().sort(function(x,y){return x-y;}),
     sz = a.length-1, i = 0, i0 = 0, n = a[0], r = [];

var format = function()
     {
     r.push(a[i0] + ((i-i0)>1 ? linker+a[i-1] : ''));
     return i0=i;
     };

while( i<sz && ( n+1>=(n=a[++i]) || format()) );

format();

return r.join(separator);
}

// Sample code:
var myTest1 = [ 1, 2, 4, 6, 9, 10, 11, 12, 13, 14, 15, 18, 19, 30 ];
alert( makeSequences(myTest1, ", ", "\u2013") );
// => 1–2, 4, 6, 9–15, 18–19, 30

var myTest2 = [ 5, 3, 4, 4, 6, 6, 9, 12, 1, 18, 21, 22, 22, 0, 8 ];
alert( makeSequences(myTest2, " | " ) );
// => 0-1 | 3-6 | 8-9 | 12 | 18 | 21-22
```

@+

Marc

• 6. Re: chronological order number needs in index numbers to be changed as ndash

Since we're all sharing, here's the function I wrote about two and a half years ago. (If I were writing it now, it would probably look a lot nicer...)

```
function FixIndexRanges(){
    var story = app.selection[0].parentStory;
    var storyWords = story.words;
    // Work backwards so that edits don't shift the words still to be visited
    for(var i=storyWords.length-1;i>0;i--){
        var curNumber = parseFloat(storyWords[i].contents);
        var prevNumber = parseFloat(storyWords[i-1].contents);
        // NaN never equals itself, so test with isNaN() rather than != NaN
        if(!isNaN(curNumber) && !isNaN(prevNumber) && prevNumber==curNumber-1){
            var insertNumber = curNumber+0;
            var theIndex = storyWords[i].characters[0].index-1;
            // Step back to the first number of the run
            while(prevNumber==insertNumber-1){
                i--;
                insertNumber--;
                try{
                    prevNumber = parseFloat(storyWords[i-1].contents);
                }catch(e){break}
            }
            var theIPIndex = storyWords[i].insertionPoints[0].index;
            // Remove the text from the start of the run up to theIndex, then insert "first-"
            story.texts.itemByRange(storyWords[i].characters[0],story.characters.item(theIndex)).remove();
            story.insertionPoints[theIPIndex].contents=insertNumber+'-';
        }
    }
}

```

Harbs

• 7. Re: chronological order number needs in index numbers to be changed as ndash

cool script  Harbs

really nice one

Sajeev

• 8. Re: chronological order number needs in index numbers to be changed as ndash

It's not the most efficient or nicely designed, but it does the job...

Harbs

http://www.in-tools.com

Innovations in Automation

• 9. Re: chronological order number needs in index numbers to be changed as ndash

Isn't it generally considered stylistically better to make an exception of "teens" -- in other words to leave the digit '1' after the en dash in '12–16' etc. rather than '12–6', but to take out the '2', '3', etc. after the en dash in '22–6', '32–6', etc.?

• 10. Re: chronological order number needs in index numbers to be changed as ndash

Jeremy bowmangraphics wrote:

Isn't it generally considered stylistically better to make an exception of "teens" -- in other words to leave the digit '1' after the en dash in '12–16' etc. rather than '12–6', but to take out the '2', '3', etc. after the en dash in '22–6', '32–6', etc.?

What says "The Chicago Manual of Style" on it?

@+

Marc

• 11. Re: chronological order number needs in index numbers to be changed as ndash
Marc Autret wrote:

> What says "The Chicago Manual of Style" on it?

I just checked, and see that the _Chicago Manual of Style_ has a rather complicated set of rules for number elision, and hence differs from the _Oxford Guide to Style_ [OGS] which recommends the "most succinct" possible, although I don't understand what it means when it says "do not elide digits in the group 10 to 19, as these represent single rather than compound numbers".

I mean, huh? How is '14' (say) "A" digit? And what the heck is a "single" as opposed to a "compound" number?

I would follow _Chicago Manual of Style_ myself, as it is more widely consulted, and makes more sense to this little guy's little brain.

Judith Butcher goes with the OGS recommendation, and so does the British and Irish Society of Indexers, which I took as my guide in my own (amateurish but working) script a year or two ago. I would not assume it is a simple American versus British English difference, however, as Nancy Mulvany (the well-known American indexer) mentions the OGS recommendation without disdain.
• 12. Re: chronological order number needs in index numbers to be changed as ndash

Jeremy,

The OP wanted to create page ranges (23, 23, 23, 24, 25 > 23-25). What you describe (23-25 > 23-5) is abbreviating an existing page range (known as "elision" or "page dropping"). And when dropping pages, it's indeed the case that you don't drop teens. The reason is that you can't pronounce dropped teens: you can say "twenty-four to eight" meaning 24-28, but you can't say "eleven to fif" or "twelve to eight". There are indeed different styles for this. In Britain, maximum elision is usually preferred (1634-7), while in other countries it is generally avoided or banned from all doublets (1634-37).
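
To make the two styles concrete, here is a rough sketch of eliding the second number of an existing range while leaving teens intact. The name `elideRange` and its `keepDigits` parameter are hypothetical, and real style guides (Chicago in particular) have more cases than this handles:

```javascript
// keepDigits = 1 gives maximum elision (1634-7),
// keepDigits = 2 never goes below a doublet (1634-37).
function elideRange(start, end, dash, keepDigits) {
    dash = dash || "\u2013";
    keepDigits = keepDigits || 1;
    var s = String(start), e = String(end);
    // Only elide when both numbers have the same number of digits
    if (s.length !== e.length) return s + dash + e;

    // Count the leading digits shared by both numbers (always keep the last one)
    var shared = 0;
    while (shared < e.length - 1 && s.charAt(shared) === e.charAt(shared)) {
        shared++;
    }
    var keep = e.length - shared;
    if (keep < keepDigits) keep = Math.min(keepDigits, e.length);

    // Teens exception: if the end number finishes in 10..19, keep both digits
    var tens = e.length >= 2 ? e.charAt(e.length - 2) : "";
    if (tens === "1" && keep < 2) keep = 2;

    return s + dash + e.slice(e.length - keep);
}
```

So `elideRange(24, 28, "-")` gives "24-8", `elideRange(12, 16, "-")` stays "12-16", and `elideRange(1634, 1637, "-", 2)` gives "1634-37".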

Peter

• 13. Re: chronological order number needs in index numbers to be changed as ndash

While we're throwing in versions, I recently rewrote that terrible code I did a few years ago and now use this:

function range_pages (str, tolerance)
{
    var array = str.split (/,\x20?/);
    var temp = "";
    var n = 0;
    var range = false;
    for (var i = 0; i < array.length; i++)
    {
        temp += array[n];
        // Extend the range while the gap to the next number is within tolerance
        while (array[n+1] - array[n] <= tolerance)
            {n++; i++; range = true}
        if (range)
            temp = temp + "-" + array[n] + ", ";
        else
            temp += ", ";
        n++; range = false;
    }
    // Strip the trailing separator
    return temp.replace (/, ?$/, "");
}

"tolerance" is the largest difference allowed between neighbouring numbers, so it lets a range skip missing numbers, which is useful if you're not obsessive about 100% correct coverage. So range_pages ("22, 23, 24, 26, 27, 28", 2) returns "22-28".

Peter

• 14. Re: chronological order number needs in index numbers to be changed as ndash

Peter wrote:

The reason is that you can't pronounce dropped teens: you can say "twenty-four to eight" meaning 24-28, but you can't say "eleven to fif" or "twelve to eight".

Aha! So that's it! I was thinking about the numerals instead of the words for the numbers, saying to myself that the "1" in "15" functions the same way as the "2" in "25".

BTW, I've often thought that the rule for dropping the S after the possessive case for "names of antiquity" (we are supposed to talk about Hobbes's Leviathan but Socrates' trial) is simply a matter of pronunciation too: "sokra-teases" sounds silly, and lots of ancient Greek names end with "eases". I guess the test is whether it is better to write "Jesus's disciples" or "Jesus' disciples".

• 15. Re: chronological order number needs in index numbers to be changed as ndash

Marc,

Your function is an elegant brainteaser which I haven't figured out (yet) and the most attractive of the lot posted here (I think -- not the only thing I don't understand that attracts me), but it does have a fault: it ignores the last number of a range at the end of the string:

20, 25, 26 > 20, 25

20, 25, 26, 27, 28 > 20, 25-27

If you replace "sz = a.length-1" with "sz = a.length" it seems to work fine.
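
For anyone else puzzling over the one-liner, the same idea can be unrolled into a plain loop that flushes the last run explicitly. This is only a sketch (`makeSequencesPlain` is a made-up name), not Marc's code, and it treats duplicates as part of the current run:

```javascript
function makeSequencesPlain(numbers, separator, linker) {
    separator = separator || ", ";
    linker = linker || "-";
    // Sort a copy so the caller's array is left alone
    var a = numbers.concat().sort(function (x, y) { return x - y; });
    var parts = [];
    var start = 0;   // index where the current run began
    var i;
    for (i = 1; i <= a.length; i++) {
        // A run ends at the end of the array, or when the next
        // value jumps by more than 1 (equal values stay in the run).
        if (i === a.length || a[i] > a[i - 1] + 1) {
            parts.push(a[start] === a[i - 1]
                ? String(a[start])
                : a[start] + linker + a[i - 1]);
            start = i;
        }
    }
    return parts.join(separator);
}
```

Because the end of the array is handled by the same test as a gap, `makeSequencesPlain([20, 25, 26])` gives "20, 25-26" and `makeSequencesPlain([20, 25, 26, 27, 28])` gives "20, 25-28".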

Peter

• 16. Re: chronological order number needs in index numbers to be changed as ndash

Thanks a lot, Peter!

I have a bug indeed (!), and my function isn't as efficient as I need it to be, so I'm working on another approach using Array.splice().

Adding the tolerance argument in your function is a brilliant improvement; my project needs this option too...

• 17. Re: chronological order number needs in index numbers to be changed as ndash

Peter,

I tried to improve my algorithm using Array.splice, but my benchmark shows that it's definitely not a good solution. The splice method is dramatically sluggish!

In the end, your approach is indisputably the most effective of the solutions I tested. It obtained good results even with huge arrays. So I started over following your logic, with some small optimizations and extra options. The code is not very elegant but the performance seems better this way:

```
function formatRanges(numbers, separator, joiner, minWidth, tolerance)
//----------------------------------------------------------
// Formats an array of integers into an ordered sequence of
// single numbers and/or ranges. Returns the formatted string.
//
// <numbers>     Array of Numbers [required]
//                    The integers to format. Supports: empty array,
//                    unsorted array, duplicated elems, negative values.
//
// <separator>     String [opt] -- Default value: ", ".
//                    A string inserted between each result.
//                    Ex.     formatRanges([4,1,3,8,9,6], " | ")
//                         => "1 | 3-4 | 6 | 8-9"
//
// <joiner>          String [opt] -- Default value: "-".
//                    A string used to format a range.
//                    Ex.     formatRanges([4,1,3,8,9,6], ", ", "_")
//                         => "1, 3_4, 6, 8_9"
//
// <minWidth>     Number [opt] -- Default value: 1.
//                    Minimum distance between the 1st and the last
//                    number in a range.
//                    Ex.     formatRanges([1,2,4,5,6,8,9,10,11], '', '', 1)
//                         => "1-2, 4-6, 8-11"
//                    Ex.     formatRanges([1,2,4,5,6,8,9,10,11], '', '', 2)
//                         => "1, 2, 4-6, 8-11"
//                    Ex.     formatRanges([1,2,4,5,6,8,9,10,11], '', '', 3)
//                         => "1, 2, 4, 5, 6, 8-11"
//
// <tolerance>     Number [opt] -- Default value: 0.
//                    Number of allowed missing numbers in a range,
//                    as suggested by Peter Kahrel (http://bit.ly/cABqIP)
//                    Ex.     formatRanges([2,3,5,8,12,17,23], '', '', 1, 0)
//                         => "2-3, 5, 8, 12, 17, 23"
//                    Ex.     formatRanges([2,3,5,8,12,17,23], '', '', 1, 1)
//                         => "2-5, 8, 12, 17, 23"
//                    Ex.     formatRanges([2,3,5,8,12,17,23], '', '', 1, 2)
//                         => "2-8, 12, 17, 23"
{
// Defaults
separator = separator || ", ";
joiner = joiner || "-";
if( minWidth !== ~~minWidth || minWidth < 1 ) minWidth = 1;
if( tolerance !== ~~tolerance || ++tolerance < 1 ) tolerance = 1;

// Init.
var a = numbers.concat().sort(function(x,y){return x-y;}),
     sz = a.length,
     n = sz && a[0],
     d = sz || false,
     i = 0, w = 0, t = 0,
     ret = [];

// Loop
while( d !== false )
     {
     if( 0 === (d=(++i<sz)?a[i]-n:false) )
          continue;      // skip duplicates

     if( d && (d<=tolerance) )
          {
          ret.push(n);
          n += d;
          ++w;
          t += (d-1);
          continue;
          }

     if( w >= minWidth )
          {
          ret.length -= w;
          ret.push((n-w-t)+joiner+n);
          }
     else
          {
          ret.push(n);
          }
     n += d;
     w = t = 0;
     }

return ret.join(separator);
}
```

@+

Marc

• 18. Re: chronological order number needs in index numbers to be changed as ndash

Very interesting -- thanks, Marc. I've also seen sluggish performance using Array.splice(). And now with the results you report it seems almost always more efficient to duplicate things into a temporary second array.
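
The splice-based attempt isn't shown in the thread, so purely as an illustration of the two styles being compared, here is duplicate removal done in place with splice() and done by copying into a second array. Both function names are invented for the example, and no timing claim is implied; that would need measuring in your own engine:

```javascript
// Style 1: remove duplicates from a sorted array with splice().
// Each splice() has to shift every element after the removed one.
function dedupeWithSplice(sorted) {
    var a = sorted.concat();              // work on a copy
    for (var i = a.length - 1; i > 0; i--) {
        if (a[i] === a[i - 1]) a.splice(i, 1);
    }
    return a;
}

// Style 2: build a second array and push only the values to keep.
function dedupeWithCopy(sorted) {
    var out = [];
    for (var i = 0; i < sorted.length; i++) {
        if (i === 0 || sorted[i] !== sorted[i - 1]) out.push(sorted[i]);
    }
    return out;
}
```

Both return the same result for sorted input, so the second form can be swapped in wherever the first one turns out to be the bottleneck.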

Peter

• 19. Re: chronological order number needs in index numbers to be changed as ndash

Very nice!

That is interesting about splice()...

Harbs

• 20. Re: chronological order number needs in index numbers to be changed as ndash

Is there a reason you are using global variables?

Harbs

• 21. Re: chronological order number needs in index numbers to be changed as ndash

Harbs. wrote:

Is there a reason you are using global variables?

I don't!

• 22. Re: chronological order number needs in index numbers to be changed as ndash

What are these?

`     sz = a.length,     n = sz && a[0],     d = sz || false,     i = 0, w = 0, t = 0,   `
• 23. Re: chronological order number needs in index numbers to be changed as ndash

Ah. Never mind. I did not notice that it was a comma at the end of the previous line...

Harbs

• 24. Re: chronological order number needs in index numbers to be changed as ndash

By the way, here is a snippet I constantly use to detect unintended global variables:

(function(){
var p, a=[];
for(p in this) a.unshift(p);
}).call(this);

I generally place it at the end of a script --in the global scope of course.

You can also use it in an empty script to study the impressive amount of automatic global variables created by the engine before anything happens!
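
The snippet above only collects the names, so you'd inspect `a` in the debugger. A small extension, as a sketch: take one snapshot before your script runs and one after, then list what was added. `listNames` and `addedNames` are hypothetical helper names:

```javascript
// Collect the enumerable property names of any object (e.g. the global scope).
function listNames(scope) {
    var p, names = [];
    for (p in scope) names.push(p);
    return names;
}

// Return the names present in `after` but not in `before`.
function addedNames(before, after) {
    var seen = {}, added = [], i;
    // Prefix the keys so names like "constructor" can't collide
    // with properties inherited from Object.prototype.
    for (i = 0; i < before.length; i++) seen["#" + before[i]] = true;
    for (i = 0; i < after.length; i++) {
        if (!seen["#" + after[i]]) added.push(after[i]);
    }
    return added;
}
```

At global scope you would call `listNames(this)` at the top and at the bottom of the script and pass both arrays to `addedNames`; anything it returns that you didn't declare on purpose is a leaked global.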

Dirk Beker showed me another trick using the undocumented '$.summary()' method, which is very useful in debugging too.

Try this:

And I discovered these ones:

@+

Marc

• 25. Re: chronological order number needs in index numbers to be changed as ndash

Nice!

Harbs

• 26. Re: chronological order number needs in index numbers to be changed as ndash

Very useful, thanks.

• 27. Re: chronological order number needs in index numbers to be changed as ndash

>It doesn't make sense (to me) to have sequences of 2 digits have a hyphen.

There's a difference between "89-90" and "89, 90": "89-90" indicates a single reference whose discussion begins on p. 89 and ends on p. 90; "89, 90" indicates two separate references. You could therefore have page references like this: "86-89, 90, 91-95". So when you see page ranges like these you can be pretty sure the index was made by hand.
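
One way to keep that distinction in a script is to store each reference as an explicit span instead of a flat list of page numbers, so adjacent references are never merged by accident. A sketch, with `formatReferences` and the `from`/`to` property names being assumptions of the example:

```javascript
// Each reference records where its discussion starts and ends.
// A single-page reference has from === to.
function formatReferences(refs, dash, separator) {
    dash = dash || "-";
    separator = separator || ", ";
    var out = [];
    for (var i = 0; i < refs.length; i++) {
        out.push(refs[i].from === refs[i].to
            ? String(refs[i].from)
            : refs[i].from + dash + refs[i].to);
    }
    return out.join(separator);
}

var refs = [
    { from: 86, to: 89 },   // one discussion spanning four pages
    { from: 90, to: 90 },   // a separate mention on p. 90
    { from: 91, to: 95 }    // another discussion
];
var formatted = formatReferences(refs);   // "86-89, 90, 91-95"
```

An automatic page ranger working from the flat list 86, 87, ..., 95 can only ever produce "86-95", which is the information loss described above.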

This is one of the reasons why automatic page rangers aren't really a good idea. Most people don't mind, though, so I don't mind using them when necessary.

Peter

• 28. Re: chronological order number needs in index numbers to be changed as ndash

I revisited this thing (again) and realised that it could be done more efficiently:

```
function page_ranges (array, obj)
{
    var temp = [];
    var range = false;
    for (var i = 0; i < array.length; i++)
    {
        temp.push (array[i]);
        while (array[i+1] - array[i] <= obj.tolerance)
            {i++; range = true}
        if (range)
            temp[temp.length-1] += obj.dash + array[i];
        range = false;
    }
    return temp;
} // page_ranges

// Sample code:
page_ranges ([1,2,3,4,7,8,9,15,17,21,22,23], {tolerance: 0, dash: "-"});
```

Peter

• 29. Re: chronological order number needs in index numbers to be changed as ndash

Very nice!

Is there a reason you'd want to use a tolerance of 0?

I like to make function interfaces as easy to use as possible.

This version makes the dash and tolerance optional:

```
function page_ranges (array, obj)
{
    obj = obj || {};
    var temp = [];
    var range = false;
    var tolerance = obj.tolerance || 0;
    var dash = obj.dash || "-";
    for (var i = 0; i < array.length; i++)
    {
        temp.push (array[i]);
        while (array[i+1] - array[i] <= tolerance)
            {i++; range = true}
        if (range){
            temp[temp.length-1] += dash + array[i];
        }
        range = false;
    }
    return temp;
} // page_ranges
```
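
One caveat with `||` defaults, offered as a sketch and not as a correction: `||` falls back on any falsy value, so an explicitly passed empty string for the dash would be replaced by "-". A hypothetical helper (`optionOr` and `readOptions` are my names for the example) that only falls back when the property is really missing:

```javascript
// Return obj[key] unless it is undefined, in which case return fallback.
function optionOr(obj, key, fallback) {
    if (obj && typeof obj[key] !== "undefined") return obj[key];
    return fallback;
}

// Resolve both options of page_ranges in one place.
function readOptions(obj) {
    return {
        tolerance: optionOr(obj, "tolerance", 0),
        dash: optionOr(obj, "dash", "-")
    };
}
```

`readOptions({dash: ""}).dash` stays "", whereas `"" || "-"` would give "-". For `tolerance` the two forms agree only because the default happens to be 0; with a default of 1, `obj.tolerance || 1` would silently override an explicit 0.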

Harbs

• 30. Re: chronological order number needs in index numbers to be changed as ndash

> Is there a reason you'd want to use a tolerance of 0?

That's a left-over from the script that produced indexes directly (i.e. without InDesign's index feature, like Marc's Index Brutal), where "tolerance = 0" meant "don't span ranges". At first I thought that there's not much point anymore now that I've put an interface on it (http://tinyurl.com/25ydd4j), but I now interpret tolerance 0 as "not skipping anything". But maybe tolerance = 1 is better for that. Dunno.
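
Since "tolerance" means slightly different things in the versions posted so far (a maximum difference between neighbours in range_pages and page_ranges, a count of missing numbers in formatRanges), here is a sketch that makes the conversion explicit. `groupRuns` and its `missing` parameter are made up for the example, and the input is assumed to be sorted and free of duplicates:

```javascript
// missing = how many absent numbers a range may bridge.
// 0 means strictly consecutive; the equivalent "maximum difference"
// between neighbours is missing + 1.
function groupRuns(sorted, missing) {
    var maxStep = missing + 1;
    var runs = [];
    var i = 0, first;
    while (i < sorted.length) {
        first = sorted[i];
        // Extend the run while the next number is close enough
        while (i + 1 < sorted.length && sorted[i + 1] - sorted[i] <= maxStep) {
            i++;
        }
        runs.push([first, sorted[i]]);   // [start, end] of the run
        i++;
    }
    return runs;
}
```

`groupRuns([22, 23, 24, 26, 27, 28], 0)` gives the two runs 22..24 and 26..28, and with `missing` set to 1 the single run 22..28. In page_ranges terms those are tolerance 1 and tolerance 2 respectively.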

> I like to make function interfaces as easy to use as possible.

> This version makes the dash and tolerance optional:

Yes, I've seen that in some of Marc's scripts too. It's a nice trick, but in this case there's always a span and a dash, so in this script optionality isn't relevant. And since in large indexes this function can be called hundreds of times, checking the options every time might slow things down.

Peter

• 31. Re: chronological order number needs in index numbers to be changed as ndash

pkahrel wrote:

...now that I've put an interface on it (http://tinyurl.com/25ydd4j)...

NICE!

Yes, I've seen that in some of Marc's scripts too. It's a nice trick, but in this case there's always a span and a dash, so in this script optionality isn't relevant. And since in large indexes this function can be called hundreds of times, checking the options every time might slow things down.

Peter

The amount of time it takes to check an undefined property is negligible.

I just did a test of one million checks and it took about 4.4 seconds in the ESTK and about 4 seconds in InDesign:

```
// Reading $.hiresTimer resets it, so endTime is the elapsed time in microseconds
var time = $.hiresTimer;
textUndefined();
var endTime = $.hiresTimer;
function textUndefined(){
    var a={};
    var i=1000000;
    while(--i>0){
        if(a.bla){
        }
    }
}

```

Interestingly enough, if I changed that check to explicitly check for undefined it takes about 2.9 seconds in ESTK and 2.5 seconds in InDesign (apparently the type conversion to a boolean costs):

```
var time = $.hiresTimer;
textUndefined();
var endTime = $.hiresTimer;
function textUndefined(){
    var a={};
    var i=1000000;
    while(--i>0){
        if(a.bla==undefined){
        }
    }
}

```

Harbs

• 32. Re: chronological order number needs in index numbers to be changed as ndash

Interesting comparisons -- thanks.

P.

• 33. Re: chronological order number needs in index numbers to be changed as ndash

@Peter & Harbs:

I suppose the page_ranges function only accepts well-prepared arrays of numbers: sorted, and with stand-alone duplicates eliminated when using tolerance 0.

See the following tests I did:

```
//TESTS:
var a = [1,1,10,10,11,11,11,11,11,14,15,16,222,222,223,289];

$.writeln("Tolerance: 0\t" + page_ranges (a, {tolerance: 0, dash: "-"}));
//Returns: 1-1,10-10,11-11,14,15,16,222-222,223,289

$.writeln("Tolerance: 1\t" + page_ranges (a, {tolerance: 1, dash: "-"}));
//Returns: 1-1,10-11,14-16,222-223,289

$.writeln("Tolerance: 3\t" + page_ranges (a, {tolerance: 3, dash: "-"}));
//Returns: 1-1,10-16,222-223,289

function page_ranges (array, obj)
{
obj = obj || {};
var temp = [];
var range = false;
var tolerance = obj.tolerance || 0;
var dash = obj.dash || "-";
for (var i = 0; i < array.length; i++)
{
temp.push (array[i]);
while (array[i+1] - array[i] <= tolerance)
{i++; range = true}
if (range){
temp[temp.length-1] += dash + array[i];
}
range = false;
}
return temp;
}
```

It seems array[0] should get special treatment when its contents are not part of a range and it is duplicated…

Uwe

• 34. Re: chronological order number needs in index numbers to be changed as ndash

Hi Laubender,

I don't get that issue with the version of formatRanges that I've posted above (#17).

The function supports "empty array, unsorted array, duplicated elems, negative values."

```
//TESTS:
var a = [1,1,10,10,11,11,11,11,11,14,15,16,222,222,223,289];

alert("Tolerance: 0\t" + formatRanges(a, ', ', '-', 1, 0));
// Returns: 1, 10-11, 14-16, 222-223, 289

alert("Tolerance: 1\t" + formatRanges(a, ', ', '-', 1, 1));
// Returns: 1, 10-11, 14-16, 222-223, 289

alert("Tolerance: 2\t" + formatRanges(a, ', ', '-', 1, 2));
// Returns: 1, 10-16, 222-223, 289

function formatRanges(numbers, separator, joiner, minWidth, tolerance)
//----------------------------------------------------------
// Formats an array of integers into an ordered sequence of
// single numbers and/or ranges. Returns the formatted string.
//
// <numbers>     Array of Numbers [required]
//                    The integers to format. Supports: empty array,
//                    unsorted array, duplicated elems, negative values.
//
// <separator>     String [opt] -- Default value: ", ".
//                    A string inserted between each result.
//                    Ex.     formatRanges([4,1,3,8,9,6], " | ")
//                         => "1 | 3-4 | 6 | 8-9"
//
// <joiner>          String [opt] -- Default value: "-".
//                    A string used to format a range.
//                    Ex.     formatRanges([4,1,3,8,9,6], ", ", "_")
//                         => "1, 3_4, 6, 8_9"
//
// <minWidth>     Number [opt] -- Default value: 1.
//                    Minimum distance between the 1st and the last
//                    number in a range.
//                    Ex.     formatRanges([1,2,4,5,6,8,9,10,11], '', '', 1)
//                         => "1-2, 4-6, 8-11"
//                    Ex.     formatRanges([1,2,4,5,6,8,9,10,11], '', '', 2)
//                         => "1, 2, 4-6, 8-11"
//                    Ex.     formatRanges([1,2,4,5,6,8,9,10,11], '', '', 3)
//                         => "1, 2, 4, 5, 6, 8-11"
//
// <tolerance>     Number [opt] -- Default value: 0.
//                    Number of allowed missing numbers in a range,
//                    as suggested by Peter Kahrel (http://bit.ly/cABqIP)
//                    Ex.     formatRanges([2,3,5,8,12,17,23], '', '', 1, 0)
//                         => "2-3, 5, 8, 12, 17, 23"
//                    Ex.     formatRanges([2,3,5,8,12,17,23], '', '', 1, 1)
//                         => "2-5, 8, 12, 17, 23"
//                    Ex.     formatRanges([2,3,5,8,12,17,23], '', '', 1, 2)
//                         => "2-8, 12, 17, 23"
{
// Defaults
separator = separator || ", ";
joiner = joiner || "-";
if( minWidth !== ~~minWidth || minWidth < 1 ) minWidth = 1;
if( tolerance !== ~~tolerance || ++tolerance < 1 ) tolerance = 1;

// Init.
var a = numbers.concat().sort(function(x,y){return x-y;}),
sz = a.length,
n = sz && a[0],
d = sz || false,
i = 0, w = 0, t = 0,
ret = [];

// Loop
while( d !== false )
{
if( 0 === (d=(++i<sz)?a[i]-n:false) )
continue;      // skip duplicates

if( d && (d<=tolerance) )
{
ret.push(n);
n += d;
++w;
t += (d-1);
continue;
}

if( w >= minWidth )
{
ret.length -= w;
ret.push((n-w-t)+joiner+n);
}
else
{
ret.push(n);
}
n += d;
w = t = 0;
}

return ret.join(separator);
}
```

@+

Marc

• 35. Re: chronological order number needs in index numbers to be changed as ndash

Uwe,

That's right, unlike Marc's function, my version requires a sorted array without duplicate numbers. It's easy to fix that, though.
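
A sketch of one such fix, for illustration only: sort a copy so the caller's array is untouched, and let the inner loop swallow duplicates. `pageRangesSafe` is a hypothetical name; `tolerance` keeps the meaning it has in page_ranges (maximum difference between neighbours), so 1 means strictly consecutive:

```javascript
function pageRangesSafe(array, obj) {
    // Sort a copy numerically; the input array is not modified
    var a = array.concat().sort(function (x, y) { return x - y; });
    var out = [];
    var i = 0, first, last;
    while (i < a.length) {
        first = last = a[i];
        // Absorb duplicates (difference 0) and any number within tolerance
        while (i + 1 < a.length && a[i + 1] - last <= obj.tolerance) {
            last = a[++i];
        }
        // Only write a range when it really spans more than one number
        out.push(first === last ? String(first) : first + obj.dash + last);
        i++;
    }
    return out;
}
```

With Uwe's test array and `{tolerance: 1, dash: "-"}` this returns 1, 10-11, 14-16, 222-223, 289.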

Peter

• 36. Re: chronological order number needs in index numbers to be changed as ndash

@Marc,

yes, I saw that when I tested your function as well. Thank you for your effort …
It's tough for me to understand exactly what's going on in your function, elegant as it is.

Uwe

• 37. Re: chronological order number needs in index numbers to be changed as ndash

@Peter,

a pre-sorted array, yes. But it's so close to working perfectly with duplicate entries.
It just takes post-processing the resulting temp array:

```
function page_ranges (array, obj)
{
    var temp = [];
    var range = false;
    //Sort input array:
    array.sort(function(x,y){return x-y;});

    for (var i = 0; i < array.length; i++)
    {
        temp.push (array[i]);
        while (array[i+1] - array[i] <= obj.tolerance)
            {i++; range = true}
        if (range)
            temp[temp.length-1] += obj.dash + array[i];
        range = false;
    }

    for(var n=0;n<temp.length;n++){
        var f = temp[n].toString().split(obj.dash);

        if(f[0]===f[1]){
            temp[n]=f[0];
        }
    }

    return temp;
} // page_ranges
```

Uwe

• 38. Re: chronological order number needs in index numbers to be changed as ndash

Some new ideas on this algorithm:

http://www.indiscripts.com/post/2013/10/page-range-formatter

@+

Marc