I am trying to convert a PMString object into a UTF-8 char* object for use with POSIX functions on a Mac. The PMString object will contain multibyte characters like Chinese, Japanese etc. I receive this PMString object from a ScriptData object which only gives a PMString or a WideString.
Using PMString.GrabCString() causes the multibyte characters to appear as code values (<4E00>) in the char* strings. I've explored a couple of ways to convert a wide char* (wchar_t*) into a UTF-8 char* on a Mac. However, the basic problem seems to be that PMString internally stores the text in UTF-16, while wchar_t is 32-bit on Mac. As a result, when I call methods like GrabUTF16Buffer or GrabWString or even GetWChar_tString, I seem to be getting a corrupt wchar_t* string with UTF-16 characters stuffed into a 32-bit wide character array. I can't seem to rebuild the same text from the wchar_t* string using PMString(wchar_t*) or even an explicit cast as PMString(UTF16Char*, numBytes).
To summarize, starting with a PMString, how do I get/convert its contents into a valid UTF-8 char* string, knowing it will contain multibyte characters? Thank you for your time.
Did you try the function StringUtils::ConvertWideStringToUTF8?
No, I hadn't. Thank you very much for the tip. I've given it a try and, on the face of it, it seems to fit my use case. I'll confirm once I've made sure it works. Thanks a bunch!
StringUtils::ConvertWideStringToUTF8 works like a charm. Thanks again!
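For anyone reading this without the SDK at hand, here is a rough sketch of what a UTF-16 to UTF-8 conversion involves, including the surrogate pairs that get mangled when 16-bit code units are treated as 32-bit wchar_t values. The helper name Utf16ToUtf8 is made up for illustration; it is not part of the InDesign SDK and is not meant to show how StringUtils::ConvertWideStringToUTF8 is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not an SDK function): converts a buffer of UTF-16
// code units into a UTF-8 encoded std::string.
std::string Utf16ToUtf8(const char16_t* src, std::size_t len) {
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        std::uint32_t cp = src[i];

        // A high surrogate followed by a low surrogate encodes one
        // code point above U+FFFF; combine the two units.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len) {
            std::uint32_t lo = src[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }
        // Any surrogate still left here is unpaired; emit U+FFFD instead.
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }

        if (cp < 0x80) {            // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {    // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {  // 3 bytes (most CJK text lands here)
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                    // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Calling Utf16ToUtf8(u"\u4E00", 1) yields the three bytes E4 B8 80, which is the UTF-8 form of the <4E00> character from the question. The result's c_str() can then be passed to POSIX functions. In a real plug-in the SDK function is the better choice; this is only meant to show why the 16-bit units have to be decoded rather than cast.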
ScriptData has a GetFile() method. You should then use FileUtils and/or OSX services to extract the path.
Btw, be careful with UTF-8 vs. OS X POSIX file names if there is a chance your environment may use NFS mounts etc.
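A small illustration of the kind of mismatch I have in mind, assuming the concern is Unicode normalization: the same visible character can be valid UTF-8 in two different byte sequences, and a file system or mount that does not normalize names will treat them as different files. The constant and function names below are made up for the example.

```cpp
#include <string>

// "é" as a single precomposed code point, U+00E9 (composed form).
const std::string kComposed = "\xC3\xA9";

// "é" as "e" followed by the combining acute accent U+0301
// (decomposed form).
const std::string kDecomposed = "e\xCC\x81";

// Hypothetical helper: a naive byte-wise comparison, which is effectively
// what happens when two paths are compared without normalization.
bool SameFileNameBytes(const std::string& a, const std::string& b) {
    return a == b;
}
```

Both strings display as the same character, yet SameFileNameBytes(kComposed, kDecomposed) is false, so a name converted from a PMString may not match the bytes stored on disk unless both sides agree on a normalization form.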
Apple's technote gives some details for driver developers, but the same problem also applies to us: