hi People
I have a little script which reads the source text from a layer and saves it to a .txt file. This is on a Mac and all was good until recently when I tried opening the .txt file on a PC in Notepad and found my ˚ degree symbols all whack.
Resaving the .txt file in TextEdit as Unicode (UTF-8) encoding solved the problem, now opens fine in Notepad.
But ideally I'd like the script to output the .txt as UTF-8 in the first place. It's currently Western (Mac OS Roman). I've tried adding myfile.encoding = "UTF8", but the resulting file is still Western (and the special characters have wigged out again).
any help greatly appreciated../daniel
{
    var theComp = app.project.activeItem;
    var dataRO = theComp.layer("dataRO").sourceText;
    // prompt user to save file
    var theFile = new File("~/Desktop/" + theComp.name + "_output.txt");
    theFile = theFile.saveDlg("Save an ASCII export file.");
    if (theFile != null) { // check user didn't cancel dialog
        theFile.lineFeed = "windows";
        //theFile.encoding = "UTF8";
        theFile.open("w", "TEXT", "????");
        theFile.writeln("move details:");
        theFile.writeln(dataRO.value.toString());
        theFile.close(); // close inside the check, theFile is null if the dialog was cancelled
    }
}
Hi,
I remember working hard two years ago on creating a correct text file on OS X, but I don't remember whether it was a UTF-8 case or something else. My home computer is not a Mac, so I have no means of testing it tonight, but here is the gist of it:
var theFile = new File(.........);
theFile.open("w", "TEXT");
theFile.encoding = "BINARY";
theFile.lineFeed = "Unix"; // the property is lineFeed, with a capital F
theFile.writeln("éàçËôù");
theFile.close();
Let me know if it is working.
Hi, I was just looking at how text software works out the encoding of a file, and I found this on Wikipedia: http://en.wikipedia.org/wiki/Byte_order_mark
So I created a UTF-8 file in Notepad and looked at the bytes. The file starts with these three bytes: 0xEF, 0xBB, 0xBF (shown as ï»¿ when they are read as Latin-1).
So you should try adding those characters at the start of the file.
var theFile = new File(.........);
theFile.open("w", "TEXT");
theFile.encoding = "BINARY";
theFile.lineFeed = "Unix";
// write the UTF-8 byte order mark (0xEF 0xBB 0xBF) before anything else
theFile.write(String.fromCharCode(0xEF) + String.fromCharCode(0xBB) + String.fromCharCode(0xBF));
theFile.write("Your stuff éàçËôù");
theFile.close();
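For reference, here is a small sketch of the byte order mark idea on its own, kept free of any file access so it can be checked outside After Effects. The names `UTF8_BOM` and `startsWithUtf8Bom` are made up for this example, and the helper assumes the string holds one character per byte, as you would get when reading a file with encoding "BINARY".

```javascript
// The three bytes of the UTF-8 byte order mark, one character per byte.
var UTF8_BOM = String.fromCharCode(0xEF, 0xBB, 0xBF);

// Hypothetical helper: true if a string of raw bytes starts with the UTF-8 BOM.
function startsWithUtf8Bom(bytes) {
    return bytes.length >= 3 &&
        bytes.charCodeAt(0) === 0xEF &&
        bytes.charCodeAt(1) === 0xBB &&
        bytes.charCodeAt(2) === 0xBF;
}
```

Writing `UTF8_BOM` first and the rest of the bytes after it is the same thing the snippet above does with three separate `String.fromCharCode` calls.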
chauffeurdevan, thanks for the suggestion. I tried a few variants with differing results but no real success.
From my testing it seems the main problem is TextEdit (and quick preview, which is kind of a bummer).
If I save the file using theFile.encoding = "UTF-8" it opens perfectly on the PC. Interestingly, the same file opened in TextMate on the Mac works fine too, as TextMate somehow interprets it as UTF-8. For some reason TextEdit assumes it is Western (Mac OS) and shows garbled characters. Inserting the BOM characters and encoding as Binary made TextEdit think it was UTF-8, but it couldn't actually open the file at all.
So the mystery remains as to how to write a UTF-8 file that both TextEdit and Notepad display correctly, but at least I have a way of writing a file that Notepad displays properly, which was the main aim... so perhaps 75% of a solution!
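The variant that worked for Notepad could be sketched roughly like this. The function name `writeUtf8Text` is made up for the example, and it assumes `file` is an ExtendScript File object (for instance the one returned by saveDlg) and that the encoding is set before the file is opened, as in the commented-out line of the original script.

```javascript
// Hypothetical helper: writes an array of lines to an ExtendScript File as UTF-8.
// Assumption: "file" offers encoding, lineFeed, open(), writeln() and close(),
// which is what the File object used earlier in this thread provides.
function writeUtf8Text(file, lines) {
    file.encoding = "UTF-8";   // the spelling reported to work above
    file.lineFeed = "Windows"; // CR+LF line endings so Notepad shows line breaks
    if (!file.open("w")) {
        return false;          // could not open the file for writing
    }
    for (var i = 0; i < lines.length; i++) {
        file.writeln(lines[i]);
    }
    file.close();
    return true;
}
```

In the original script it would be called as writeUtf8Text(theFile, ["move details:", dataRO.value.toString()]) inside the check that the dialog was not cancelled.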
Hi,
Got it, it seems. The UTF-8 standard uses two (or more) bytes to encode accented and special characters.
I found some info, with some code, here: http://ivoronline.com/Coding/Theory/Tutorials/Encoding%20-%20Text%20-%20UTF%208.php
There were some errors in that code, so I fixed them. (I didn't test it for 3- and 4-byte characters, though, so maybe you'll have to change the 0xbf back to 0x3f or something else.)
So here is the code.
function convertCharToUTF(character) {
    var utfBytes = "";
    var c = character.charCodeAt(0);
    // Note: the 0xBF mask gives the same result as 0x3F here,
    // because bit 7 is set by the OR with 0x80 anyway.
    if (c < 0x80) {
        utfBytes = String.fromCharCode(c);
    } else if (c < 0x800) {
        utfBytes = String.fromCharCode(0xC0 | (c >> 6));
        utfBytes += String.fromCharCode(0x80 | (c & 0xBF));
    } else if (c < 0x10000) {
        utfBytes = String.fromCharCode(0xE0 | (c >> 12));
        utfBytes += String.fromCharCode(0x80 | ((c >> 6) & 0xBF));
        utfBytes += String.fromCharCode(0x80 | (c & 0xBF));
    } else if (c < 0x200000) {
        // charCodeAt returns 16-bit code units, so this branch is not reached as written
        utfBytes = String.fromCharCode(0xF0 | (c >> 18));
        utfBytes += String.fromCharCode(0x80 | ((c >> 12) & 0xBF));
        utfBytes += String.fromCharCode(0x80 | ((c >> 6) & 0xBF));
        utfBytes += String.fromCharCode(0x80 | (c & 0xBF));
    }
    return utfBytes;
}

function convertStringToUTF(stringToConvert) {
    var utfString = "";
    for (var i = 0; i < stringToConvert.length; i++) {
        utfString = utfString + convertCharToUTF(stringToConvert.charAt(i));
    }
    return utfString;
}

var theFile = new File("~/Desktop/_output.txt");
theFile.open("w", "TEXT");
theFile.encoding = "BINARY";
theFile.lineFeed = "Unix";
// write the UTF-8 byte order mark (0xEF 0xBB 0xBF) first
theFile.write(String.fromCharCode(0xEF) + String.fromCharCode(0xBB) + String.fromCharCode(0xBF));
theFile.write(convertStringToUTF("Your stuff éàçËôù"));
theFile.close();
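For anyone who needs the untested 4-byte case: characters above U+FFFF are stored in a JavaScript string as two UTF-16 surrogate code units, so they have to be joined into one code point before encoding. Below is a minimal sketch of that, using the standard 0x3F mask. The function names are made up for this example, it has not been run inside After Effects, and lone surrogates are simply passed through the 3-byte branch rather than rejected.

```javascript
// Hypothetical helper: encodes one Unicode code point as a string of UTF-8 bytes
// (one character per byte, suitable for a file with encoding "BINARY").
function codePointToUtf8Bytes(cp) {
    if (cp < 0x80) {
        return String.fromCharCode(cp);
    }
    if (cp < 0x800) {
        return String.fromCharCode(0xC0 | (cp >> 6), 0x80 | (cp & 0x3F));
    }
    if (cp < 0x10000) {
        return String.fromCharCode(0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F));
    }
    return String.fromCharCode(0xF0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F));
}

// Hypothetical helper: encodes a whole string, joining surrogate pairs first.
function stringToUtf8Bytes(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        var unit = text.charCodeAt(i);
        // A high surrogate followed by a low surrogate forms one code point.
        if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < text.length) {
            var next = text.charCodeAt(i + 1);
            if (next >= 0xDC00 && next <= 0xDFFF) {
                unit = 0x10000 + ((unit - 0xD800) << 10) + (next - 0xDC00);
                i++; // skip the low surrogate, it has been consumed
            }
        }
        out += codePointToUtf8Bytes(unit);
    }
    return out;
}

// Hypothetical helper for checking results: bytes as space-separated hex.
function toHex(bytes) {
    var parts = [];
    for (var i = 0; i < bytes.length; i++) {
        var h = bytes.charCodeAt(i).toString(16).toUpperCase();
        parts.push(h.length < 2 ? "0" + h : h);
    }
    return parts.join(" ");
}

// toHex(stringToUtf8Bytes("\u00E9")) gives "C3 A9" (é)
// toHex(stringToUtf8Bytes("\u02DA")) gives "CB 9A" (the ˚ symbol from the first post)
```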