When you select the .csv from the file dialog, check the "Import Options" box. You can choose encoding and platform in the subsequent dialog box.
Multi-paragraph fields are a problem when a carriage return is used as a separator. One solution is to find such fields in the db, use a regex to replace paragraph markers in the multi-paragraph fields with some other marker, then a second regex in InDesign after importing the .csv to put the paragraph markers back in.
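Here is a rough sketch of that two-step swap, in case it helps. It assumes you can post-process the field values in Python before writing the .csv; the placeholder string and the function names are made up for illustration, so pick any marker that never occurs in your data.

```python
import re

# Hypothetical placeholder; choose any string that never appears in your data.
PLACEHOLDER = "@@PARA@@"

def protect_paragraphs(field: str) -> str:
    """Replace every paragraph break inside one field with the placeholder."""
    # Match CRLF first so a Windows line ending becomes one placeholder, not two.
    return re.sub(r"\r\n|\r|\n", PLACEHOLDER, field)

def restore_paragraphs(text: str, paragraph_mark: str = "\r") -> str:
    """Put the paragraph breaks back (the reverse of protect_paragraphs)."""
    return text.replace(PLACEHOLDER, paragraph_mark)

field = "First paragraph.\r\nSecond paragraph."
print(protect_paragraphs(field))
```

On the InDesign side, the second step would be a GREP Find/Change that searches for the placeholder and changes it to \r (end of paragraph).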
i am working on a mac.
i am trying to place a text file to synch with my indesign styles.
text file has the first line:
the "import Options" box shows in "Character Set:" Unicode UTF-16 with no option to change
file places but does not pick up styles but come in as:
<pstyle:Community name>(Enota) Mountain Retreat
seems like <UNICODE-MAC> is not the correct first line code
Oh, InDesign Tagged Text! I totally missed that. I haven't used it in a loooong time, but I remember solving encoding problems like this one. It sounds like you are placing raw text, not Tagged Text. What file extension does your placed file have?
thanks for getting back to me.
extension is .txt
it has worked when i changed the <UNICODE-MAC> TO <ANSI-MAC> but some of the characters are apparently not ANSI characters and come in incorrectly.
that makes it seem like some adjustment to <UNICODE-MAC> is needed.
I don't know about that. When I have an InDesign Tagged Text file with the <UNICODE-MAC> header and place it using File -> Place, I don't see the <pstyle: stuff that you see. InDesign successfully parses the Tagged Text file and places the non-ANSI text (in my case, Cyrillic) in the file. The fact that you are seeing the raw contents of the text file ("<pstyle:"), and that it's being generated on another platform by a database, tells me that your tagged text file may not have the correct syntax.
What's the next entry in your file, after <UNICODE-MAC>? Is it something like this?
the next line is:
i think the incorrect syntax is in the first line somehow.
everything else has been the same with a first line that has worked but with a different character set -- ANSI.
Ah, no. You've got the encoding declaration correct, so that's not the issue. Are you sure that the encoding of the text file is actually UTF-16? Is it really Mac-platform? Can you share your tagged text file?
i don't see a way to attach a file here.
Put it on a file sharing site like Dropbox or WeTransfer and post a link here.
I'm no Tagged Text expert, but it looks to me like you are missing some stuff, like the feature set definition: <fset:InDesign-Roman> and possibly the color table information, both of which appear in the same line as the version number in my test sample.
Adobe Tagged Text needs to be--for UTF-16--encoded as UTF-16 LE (Little Endian) and line endings likely need to be UNIX Line Feeds, not Mac carriage returns.
For a PC (which is what I use), the only difference is the line endings need to be Windows CR/LF type.
You can use TextWrangler on a Mac to open and save the output from the database to the above UTF-16 Little Endian format.
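If you would rather script the conversion than do it by hand in an editor, something like the following might work. It is only a sketch: the function name and parameters are invented here, the source encoding has to be supplied by you, and the line ending is left as a parameter because which one InDesign accepts is exactly what is in question in this thread.

```python
import codecs

def convert_tagged_text(data: bytes, source_encoding: str,
                        newline: str = "\n", add_bom: bool = True) -> bytes:
    """Re-encode raw file bytes as UTF-16 LE with one line-ending style."""
    text = data.decode(source_encoding)
    # Drop a BOM that survived decoding (e.g. a UTF-8 file saved with a BOM).
    text = text.lstrip("\ufeff")
    # Normalize CRLF and bare CR to LF first, then switch to the requested style.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    if newline != "\n":
        text = text.replace("\n", newline)
    body = text.encode("utf-16-le")
    return codecs.BOM_UTF16_LE + body if add_bom else body

sample = "<UNICODE-MAC>\r\nline one\rline two".encode("utf-8")
converted = convert_tagged_text(sample, "utf-8")
print(converted[:2])  # the UTF-16 LE byte order mark
```

To use it on a real file, read the export with open(path, "rb").read() and write the result back with open(path, "wb"). Pass newline="\r\n" or newline="\r" to try the Windows or Mac variants.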
Mike nailed it, pretty much - I have only one quibble.
In general, you can use a variety of encodings to save tagged text - you're not limited to UTF-16. (At least, you weren't back in CS4, the last time I used Tagged Text for anything at all.) However, you have to have the right kind of line termination to match your platform - here, Mac carriage returns. Your text needs to be encoded in a way that supports all the characters you're using, and if you want that Unicode RIGHT SINGLE QUOTATION MARK in "WWOOF'ing" on line 7, then your file should be saved as Unicode, and the header should declare it.
So: if you want to declare as ANSI-MAC then you have to strip out your fancy quotes before exporting tagged text from the DB. I did that - I changed the right single quote to a straight single quote - and the Tagged Text interpreter parsed it successfully. Well, successfully but for the fact that you only gave us a fragment, and it threw a gigantic list of errors.
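For the ANSI route, stripping the fancy quotes could be scripted along these lines. This is a minimal sketch that only handles the four common curly quotes; the function name is made up, and any other non-ANSI characters in the export would still need their own handling.

```python
# Map curly quotes to their straight equivalents.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',   # RIGHT DOUBLE QUOTATION MARK
})

def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight ones."""
    return text.translate(QUOTE_MAP)

print(straighten_quotes("WWOOF\u2019ing"))  # prints WWOOF'ing
```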
You should instead declare Unicode, as your Linux DB wrangler wants to do. But then you should save with Windows platform CRLF, not Unix LF as Mike suggests. Maybe there's something new in Tagged Text that I don't know about, but the encoding & platform declaration specified in the "Using InDesign Tagged Text" reference include only WIN and MAC.
Start file tag:
Specify the encoding format (ASCII, ANSI, UNICODE, SJIS, CGB18030, BIG5, or KSC5601) followed by the platform (MAC or WIN).
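A quick way to sanity-check the first line of an export against that rule is sketched below. It assumes the tag has the form <ENCODING-PLATFORM>, as in the <UNICODE-MAC> and <ANSI-MAC> examples in this thread, and it uses the encoding and platform names exactly as quoted above; the function name is hypothetical.

```python
import re

# Names as quoted from the reference above.
ENCODINGS = ("ASCII", "ANSI", "UNICODE", "SJIS", "CGB18030", "BIG5", "KSC5601")
PLATFORMS = ("MAC", "WIN")

START_TAG = re.compile(
    r"<(%s)-(%s)>" % ("|".join(ENCODINGS), "|".join(PLATFORMS))
)

def parse_start_tag(first_line: str):
    """Return (encoding, platform) if the line opens with a valid tag, else None."""
    # Ignore a leading byte order mark if the decoder left one in place.
    match = START_TAG.match(first_line.lstrip("\ufeff"))
    return (match.group(1), match.group(2)) if match else None

print(parse_start_tag("<UNICODE-MAC>"))   # ('UNICODE', 'MAC')
print(parse_start_tag("<UNICODE-UNIX>"))  # None
```

Note that this only checks the declaration itself. It says nothing about whether the file's real encoding and line endings match what the tag claims.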
So, grab the text file supplied by your database wrangler, open it in TextWrangler, then convert to UTF-16, Windows platform, then change the header to <UNICODE-WIN>, and then resave the resulting file. Then your styles will come in correctly, without any need to strip out your fancy curly quotes.
Okay, I just re-read my post, and then did some experimenting. It's fishy. You can save your file as UTF-16 for Mac platform, declare its encoding and platform correctly, and it won't import as tagged text; it'll import as raw text. You can do pretty much anything else - it works with Shift-JIS encoding and Mac CR, or Big5 and Mac CR, for example - and it interprets the tagged text correctly. It's only Unicode-declared text with Mac platform that is the problem.
i will send it on to my db guy.
could you explain what you did?
He did exactly what he previously suggested, which I couldn't get to work - he saved with Unicode UTF-16 encoding and Unix platform line endings.
It's all in the file name. So:
UTF-16 Little Endian with Unix line endings.
Do note that the LE (Little Endian) format is necessary, if I recall correctly. BE (Big Endian) will not work.
If your developer has issues, post back into this thread so I get notification as I am not always around. You can also feel free to PM me here as I will then also receive an email of that communication.
I checked your text file and it imports OK in InDesign CS6 on my Mac, provided all the paragraph styles it uses are present in the Paragraph Styles panel.
One thing is odd with the tagged text.
There are three different paragraph styles defined within one single paragraph, separated by tabs.
This cannot be correct, I think.
Maybe from an attempt to represent a one-row table with three cells by the developer?
I'd guess character styles were intended rather than a table. But hey, I dunno. As well, the style names are way longer than they ought to be, in my opinion.
hi mike, et al.,
yes, the style names are longer than they need to be.
i was just trying to not create extra work for the db guy by getting him to rename his fields.
the multiple styles in a paragraph are to get various fields on the same line in the text.
your file worked for me but the latest effort here still did not work.
i'm waiting to get the next export to see if he can work it out.
if not i will try and get him to communicate directly.
really appreciate all the effort to help us out.
what is PM?
PM = Private Message.
Unless (or until) I return to the forum, I will not receive a notification that there is a reply to the thread (it's shut off). However, a PM to me will send an email to my gmail account and so even if I am not around to visually see the notification icon at the top of the forum page, I'll receive the email letting me know to come back to the forum.
If all else fails (or even just as another route to try), produce it as UTF-16 with Win/DOS line endings and try that. As well, you should be able to simply convert it yourself, no matter the encoding, using any decent text editor.
Joel - (no access to my Mac at the moment): I've had my fair share of problems getting Tagged Text to work properly, and in the end I settled for UTF-16 LE with Mac line endings as well. (Which is a drag because then you cannot use <LF> for soft breaks - and there is no 'tag' for it.)
Can you check if the BOM could have been an issue? I use TextWrangler to debug such problems, as it offers saving both with or without a BOM.
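If you want to check for a BOM and count line endings without opening an editor, a small script along these lines could do it. It is a sketch only: the function name is invented, it assumes UTF-16 LE when no BOM is found (change assumed_encoding if that is wrong for your file), and it does not try to tell UTF-32 apart from UTF-16.

```python
import codecs

def describe_file_bytes(data: bytes, assumed_encoding: str = "utf-16-le") -> dict:
    """Report which BOM (if any) starts the data and count each line-ending style."""
    boms = [
        ("utf-16-le", codecs.BOM_UTF16_LE),
        ("utf-16-be", codecs.BOM_UTF16_BE),
        ("utf-8", codecs.BOM_UTF8),
    ]
    bom_name = None
    encoding = assumed_encoding
    for name, bom in boms:
        if data.startswith(bom):
            bom_name, encoding = name, name
            data = data[len(bom):]  # strip the BOM before decoding
            break
    text = data.decode(encoding, errors="replace")
    crlf = text.count("\r\n")
    return {
        "bom": bom_name,
        "crlf": crlf,                    # Windows
        "cr": text.count("\r") - crlf,   # classic Mac (bare CR)
        "lf": text.count("\n") - crlf,   # Unix (bare LF)
    }

sample = codecs.BOM_UTF16_LE + "a\r\nb\rc\nd".encode("utf-16-le")
print(describe_file_bytes(sample))
```

Feed it the raw bytes from open(path, "rb").read(). A file that mixes line-ending styles will show more than one non-zero count.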