Hi all,
Was wondering if anyone can point me to the improvements to Data Merge in the newest version of InDesign (assuming there are some). We are specifically looking for a way to output single PDFs (a 500-page Data Merged InDesign document saved as 500 PDF files, for example), with each PDF named using a column from the attached Excel file. I realise that it will be possible to have this JavaScripted, but was kind of hoping to avoid this with the newest InDesign.
Thanks,
Graham
There are no changes in Data Merge in InDesign CS6.
Thanks Steve.
Have there been any changes to Data Merge since CS3?
If you're getting 500 separate PDF files, you're doing something wrong. You should be able to save your Excel file (with graphic file references) in tab-delimited TXT format, attach it to your InDesign template through the Merge function, complete your layout by dragging fields, then generate your catalog. Check the InDesign Help. The column headings should be your field names.
Cosmo,
Did you read the first post? The OP isn't getting separate PDFs, he WANTS separate PDFs and was hoping something had changed to allow it.
Hi all,
One of the Multimedia guys here wrote a JavaScript that saves the PDF pages as separate files and names them using a column from an Excel file. Works quite nicely and I could upload it here in case anyone is interested.
Graham
Graham,
Please please do upload it to pastebin / here / similar, that would be immensely helpful.
Here you go. If you run the script, it will ask which CSV file to use for the naming. Where it says 'PartnerHQ_Id' in my script, replace that with the column name you want to use. Let me know if anything is unclear; I am not really a JavaScript kind of guy ...
/* Put script title here */
var CSV = function(data) {
    var _data = data.split('\r\n');
    for(var i in _data) {
        if(_data[i].length > 0) {
            console.println(i + ' ' + _data[i]);
            _data[i] = _data[i].split(','); // split each non-empty line into fields
        }
    }
    var _head = _data.shift(); // first row holds the column names
    return {
        length: function() {
            return _data.length - 1;
        },
        getRow: function(row) {
            return _data[row];
        },
        getRowAndColumn: function(row, col) {
            if(typeof col !== 'string') {
                return _data[row][col];
            } else {
                // look the column up by name, ignoring case
                col = col.toLowerCase();
                for(var i in _head) {
                    if(_head[i].toLowerCase() === col) {
                        return _data[row][i];
                    }
                }
            }
        }
    };
};
this.importDataObject("CSV Data");
var dataObject = this.getDataObjectContents("CSV Data");
var csvData = new CSV(util.stringFromStream(dataObject));
if(this.numPages != csvData.length()) {
    app.alert("Number of pages & CSV row count inconsistent");
} else {
    // one single-page PDF per row, named from the chosen column
    for(var i = 0; i < this.numPages; i++) {
        this.extractPages({nStart: i, cPath: csvData.getRowAndColumn(i, 'PartnerHQ_Id') + '.pdf'});
    }
}
not sure if it works in CS6 (am trying this in CS5.5 mac running OSX 10.5.8) but I get a dialog with:
Error 24
Error String: this.importDataObject is not a function
on line 37
anyone else tried this script yet?
Finally on CS6 and have retested GrahamHe's script. Still has the same error in InDesign, but that is because this is NOT an InDesign script - it's an ACROBAT script! The script is applied using the Action Wizard.
So this doesn't behave the way I thought; it isn't run directly from InDesign. The file is still merged to a single multi-page PDF file, and then the script is run on that PDF via the Action Wizard.
Had trouble getting the script to work initially via Acrobat but I did amend the second line to read
var _data = data.split('\r');
and it worked a treat!
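For anyone hitting the same line-ending problem, here is a minimal sketch of a splitter that accepts Windows (\r\n), old Mac (\r) and Unix (\n) line endings alike, so the script should not need editing per file. The name splitCsvLines is made up for illustration and is not part of Graham's script.

```javascript
// Hypothetical helper: split CSV text into lines whatever the line-ending style.
// "\r\n" is listed first so a Windows line break counts as one break, not two.
function splitCsvLines(text) {
    return text.split(/\r\n|\r|\n/);
}
```

In the script, `var _data = splitCsvLines(data);` would take the place of the `data.split(...)` line.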
One warning: the names in the column selected to become the filenames need to be unique (such as a primary key) otherwise there is a risk of files overwriting each other.
colly
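The overwrite risk colly mentions can be checked before any PDFs are written. The following is a rough sketch under a few assumptions: the names have already been read out of the CSV column into an array, and names are compared case-insensitively because Mac and Windows file systems usually treat "A1.pdf" and "a1.pdf" as the same file. findDuplicateNames is a hypothetical helper, not part of either script in this thread.

```javascript
// Hypothetical helper: return every name that occurs more than once,
// comparing case-insensitively. Each duplicate is reported a single time.
function findDuplicateNames(names) {
    var seen = {};
    var duplicates = [];
    for (var i = 0; i < names.length; i++) {
        // The prefix keeps keys such as "constructor" from clashing
        // with properties every object inherits.
        var key = "name:" + String(names[i]).toLowerCase();
        var count = seen.hasOwnProperty(key) ? seen[key] : 0;
        if (count === 1) {
            duplicates.push(names[i]);
        }
        seen[key] = count + 1;
    }
    return duplicates;
}
```

If the returned array is not empty, it is probably safer to fix the CSV than to run the split.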
Here's an improved version of that script:
If you have trouble running it in Acrobat's Actions panel, try running it in the JavaScript console instead. Hit Ctrl-J or Cmd-J, then enable the console. Copy and paste the code in, select the code you just pasted, and hit ENTER (not Return, on Mac) to run the code.
In my testing, (700kb PDFs, 600 or so records, 6gb RAM Mac Pro) it's pretty darn fast. Everything else I'd tried takes hours and hours or dies mid way, this takes less than a minute. Reading the CSV is probably the slowest part, then it spits out PDFs at a rate of several a second. It seems to take maybe 5% as long as the data merge takes.
Here's the improved code:
var CSV = function (data, delimiter) {
    var _data = CSVToArray(data, delimiter);
    var _head = _data.shift(); // first row holds the column names
    return {
        length: function () {return _data.length;},
        adjustedLength: function () {return _data.length - 1;},
        getRow: function (row) {return _data[row];},
        getRowAndColumn: function (row, col) {
            if (typeof col !== "string") {
                return _data[row][col];
            } else {
                // look the column up by name, ignoring case
                col = col.toLowerCase();
                for (var i in _head) {
                    if (_head[i].toLowerCase() === col) {
                        return _data[row][i];
                    }
                }
            }
        }
    };
};
function CSVToArray( strData, strDelimiter ){
    strDelimiter = (strDelimiter || ",");
    var objPattern = new RegExp(
        (
            // Delimiters.
            "(\\" + strDelimiter + "|\\r?\\n|\\r|^)" +
            // Quoted fields.
            "(?:\"([^\"]*(?:\"\"[^\"]*)*)\"|" +
            // Standard fields.
            "([^\"\\" + strDelimiter + "\\r\\n]*))"
        ),
        "gi"
    );
    var arrData = [[]];
    var arrMatches = null;
    while (arrMatches = objPattern.exec( strData )){
        var strMatchedDelimiter = arrMatches[ 1 ];
        // A delimiter other than the field delimiter means a new row has started.
        if ( strMatchedDelimiter.length && (strMatchedDelimiter != strDelimiter) ){
            arrData.push( [] );
        }
        if (arrMatches[ 2 ]){
            // Quoted value: unescape doubled quotes.
            var strMatchedValue = arrMatches[ 2 ].replace( new RegExp( "\"\"", "g" ), "\"" );
        } else {
            var strMatchedValue = arrMatches[ 3 ];
        }
        arrData[ arrData.length - 1 ].push( strMatchedValue );
    }
    return( arrData );
}
function isInt(n) {
    return typeof n === "number" && n % 1 == 0;
}
var prepend = app.response("Enter any text to go at the START of each filename:");
var append = app.response("Enter any text to go at the END of each filename:");
var pathStr = app.response("If the PDFs should be saved in a sub folder, enter the relative path here:", "", "pdf/");
this.importDataObject("CSV Data");
var dataObject = this.getDataObjectContents("CSV Data");
var csvData = new CSV(util.stringFromStream(dataObject, 'utf-8'), ',');
var pagesPerRecord = this.numPages / csvData.length();
if (isInt(pagesPerRecord)) {
    // Loop once per CSV record; each record covers pagesPerRecord consecutive pages.
    for (var i = 0; i < csvData.length(); i ++) {
        var pageStart = i*pagesPerRecord;
        var pageEnd = (i+1)*pagesPerRecord - 1;
        var recordIndex = (i + pagesPerRecord) / pagesPerRecord;
        var filename = csvData.getRowAndColumn(i, "filename");
        if (!filename) {
            app.alert('No filenames found - using "file-XX.pdf". Press Escape after continuing to cancel.');
            filename = "file-" + i;
        }
        var settings = {nStart: pageStart, nEnd: pageEnd, cPath: pathStr+prepend+filename+append+'.pdf'};
        this.extractPages(settings);
    }
} else {
    var message = "The number of pages per row is not an integer (" + pagesPerRecord;
    message += ", " + this.numPages + " pages, " + csvData.length() + " rows).";
    // Report the mismatch so the script does not stop silently.
    app.alert(message);
}
It seems this script gets slower the more hyperlinks there are in the document:
In each case it's "Saving PDF..." that is the point the process struggles with. I don't know what it is about the addition of hyperlinks that makes the PDFs so slow to save.
I think it's maybe something to do with cross-reference tables - the final PDFs are 25% cross reference tables according to Optimise PDF's audit tool, and there are no cross references so I can only guess that this is the hyperlinks. Why so much data, I have no idea.
Thanks for posting your revised script, it would seem that it should do exactly what I require.
I think I've set it up correctly as an Action in Acrobat.
I have a 137 page PDF which I want to split into single page documents. I have a .csv file with a list of 137 unique IDs, which I would like Acrobat to use to name the individual files.
I run the action and get prompted to input anything to prefix and suffix the filename with, together with an option of a relative path.
I then get asked to choose a data file, so I choose my CSV list.
The script seems to run ok but I get no output.
I'm guessing at some point it should ask me for column name and also how many pages per file output??
Or do I need to edit that in the script first? If so, which are the parts to edit?
Thanks in advance,
Ben
Hey Ben, sorry looks like I wasn't clear - the CSV file must contain a column headed "filename" (lowercase). It pretty much only uses that one column, so it can be a one column CSV. It's written into the code.
If you want it to be variable (e.g. if you get CSVs you can't edit), add a line around line 67 like:
var columnHeader = app.response("Enter the column header of the filename column in the CSV", "", "filename");
...then find/replace "filename" (including quotes) with columnHeader.
A few other tips:
Hi,
hoping someone can help.
I have a 48 page document which I need to split into 8 documents. I have the excel spreadsheet with column headed "filename" with 8 file names.
When I select the data file to import after being asked to enter text at start of filename etc I see the message 'No filenames found - using "file-XX.pdf". Press Escape after continuing to cancel.'
and then the error -
RaiseError: The file may be read-only, or another user may have it open. Please save the document with a different name or in a different folder.
Doc.extractPages:83:Console undefined:Exec
===> The file may be read-only, or another user may have it open. Please save the document with a different name or in a different folder.
file-0
I have checked the number of rows in the CSV against the one I used to create the merge in InDesign and it is the same. I have also tried using a Google Docs spreadsheet but Acrobat doesn't recognise the URL.
Thank you.
Sounds like you've got the CSV file open in Excel or something similar.
Certainly with Excel for Mac, it gets stroppy if the CSV file is open - Excel won't let the script read it. Not sure about any other programs but I imagine they're similar.
Make sure the CSV is closed before running the script. I find it works best to have an excel version and a CSV version, keeping the excel version open to make any edits and closing the CSV versions as soon as they're saved.
Thanks for your reply.
I tried exiting Excel but the same problem occurred.
Hi alanomaly,
I managed to get Acrobat to read a CSV which was created in LibreOffice (as you mentioned) but I just have one more issue.
I am getting the error message 'The number of pages per row is not an integer (5.454545454545454, 60 pages, 11 rows).' which makes me believe that it doesn't realise I have 6-page documents.
Thanking you in advance.
Hi, Did you receive an answer to this issue? I get the same error:
The number of pages per row is not an integer (5.333333333333333, 16 pages, 3 rows).
How does the code know how many pages each document should be?
If it's saying 'The number of pages per row is not an integer (5.454545454545454, 60 pages, 11 rows)', that means the number of pages isn't divisible by the number of rows. So, yvesha has 60 pages and 11 rows, which means 5.4545 pages per row. That doesn't make sense: you can't have a PDF with 5.4545 pages. It sounds like you're aiming for 6 pages per row, which means there should be 10 rows in your CSV ( 60 / 10 = 6 ).
TStanwood has 16 pages and 3 rows, but 3 doesn't go into 16. This could mean there are too many pages in the InDesign file (maybe there's one extra page and it's supposed to be 3 five-page PDFs?) or the wrong number of rows in the CSV.
Sometimes Excel adds empty rows to CSVs for no reason other than wanting to ruin your day... if the number of rows the script says you have doesn't match the actual number of rows you're seeing in the CSV in Excel, open the CSV with a plain text editor and delete any empty lines of text.
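The arithmetic above can be checked outside Acrobat before running the split. This is only a sketch, with assumptions: the first line of the CSV is the header, no field contains a line break inside quotes, and a line holding nothing but commas or spaces counts as empty. countDataRows, wholePagesPerRecord and pageRangeFor are hypothetical names, not functions from the script.

```javascript
// Count usable data rows: skip the header line and any empty lines.
function countDataRows(csvText) {
    var lines = csvText.split(/\r\n|\r|\n/);
    var count = 0;
    for (var i = 1; i < lines.length; i++) { // index 0 is the header
        if (lines[i].replace(/[\s,]/g, "") !== "") {
            count++;
        }
    }
    return count;
}

// Pages per output PDF, or null when the pages do not divide evenly.
function wholePagesPerRecord(numPages, rowCount) {
    if (rowCount <= 0) {
        return null;
    }
    var ratio = numPages / rowCount;
    return (ratio % 1 === 0) ? ratio : null;
}

// Zero-based first and last page for one record, in the shape extractPages expects.
function pageRangeFor(recordIndex, perRecord) {
    return {
        nStart: recordIndex * perRecord,
        nEnd: (recordIndex + 1) * perRecord - 1
    };
}
```

With 60 pages and 10 rows, wholePagesPerRecord gives 6 and the second record covers pages 6 to 11 (zero-based). With 11 rows it gives null, which is the situation the error message describes.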
The issue is I have two rows, but it's counting the header row as one.
Todd
Hi, thanks for responding. I only have two rows but it's counting the header row as one making it three. Todd
How do I set this script/action up + execute it?
What does the process look like from CSV to InDesign to Acrobat to final PDF?
I'm slow!
Hi Alan,
Thanks for the script, but like Ben, it seems to run but doesn't actually give any output. When clicking the full report it states that it has been successful, but it takes less than a second to complete and there are no files.
I don't know if I am putting the right info in the dialogue box that asks: "If the pdf should be saved in subfolders..." - what info needs to go here?
Really hoping I can get this to work
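On the sub-folder prompt: the script builds each output path as pathStr + prepend + filename + append + '.pdf', so that box takes a folder path relative to the source PDF, ending in a slash (the default is "pdf/"). Here is a small sketch of the same path-building, with two additions that are assumptions rather than anything in the posted script: it adds a missing trailing slash, and it treats a cancelled prompt (which, as far as I know, makes app.response return null) as empty text. buildOutputPath is a hypothetical name.

```javascript
// Hypothetical helper mirroring how the script assembles cPath.
function buildOutputPath(folder, prefix, name, suffix) {
    var dir = folder || ""; // null or empty: save next to the source PDF
    if (dir !== "" && dir.charAt(dir.length - 1) !== "/") {
        dir += "/"; // make sure folder and file name are separated
    }
    return dir + (prefix || "") + name + (suffix || "") + ".pdf";
}
```

For example, buildOutputPath("pdf/", "", "1001", "") gives "pdf/1001.pdf". It is probably safest to create the "pdf" folder beside the source PDF before running the action; whether Acrobat creates it on its own is not confirmed anywhere in this thread.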