> I would like to strip (or replace) all non-UTF-8 characters from a string (for example, from a form text field). What is the simplest way to achieve that?
There's no such thing as a non-UTF-8 character: UTF-8 can encode every Unicode code point. What exactly are you trying to do?
I'm trying to clean the string before sending it to a database (SQL Server).
> I'm trying to clean the string before sending it to a database (SQL Server).
OK, but strip it of what? Everything's in Unicode. UTF-8 is a stingy variable-width
encoding (i.e. it uses one byte for ASCII and expands to two, three, or four bytes
only when a character needs them), so what exactly are you trying to get rid of?
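To make the "expands only if needed" point concrete, here is a minimal Python sketch that counts the bytes UTF-8 uses for a few characters. The helper name `utf8_width` and the sample characters are just illustrations, not anything from the poster's setup.

```python
def utf8_width(ch: str) -> int:
    """Return how many bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


# Sample characters (chosen for illustration): ASCII letter, accented letter,
# euro sign, and an emoji outside the Basic Multilingual Plane.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X} takes {utf8_width(ch)} byte(s) in UTF-8")
```

Every one of these encodes cleanly, which is why "non-UTF-8 character" isn't a meaningful category. The question is really which characters the rest of the pipeline can't store or display.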
When you paste text (for example, from MS Word) into a form field and write the string to a SQL Server database, some characters show up in the database as a symbol (a square).
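If the goal really is to replace Word's typographic punctuation before storing the text, one possible approach is a plain lookup table. This is only a sketch in Python: the mapping below is an assumed, incomplete list of the characters Word commonly substitutes, and `replace_word_punctuation` is a hypothetical helper name.

```python
# Assumed mapping of common Word "smart" punctuation to ASCII stand-ins.
# It is not exhaustive; extend it for whatever actually shows up in your data.
WORD_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
}

_TABLE = str.maketrans(WORD_PUNCTUATION)


def replace_word_punctuation(text: str) -> str:
    """Replace the mapped characters; everything else passes through unchanged."""
    return text.translate(_TABLE)


print(replace_word_punctuation("\u201cHi\u201d \u2013 it\u2019s done\u2026"))
```

Note that this changes the user's text, so it is a workaround. If the column and driver handle Unicode correctly, no replacement should be needed.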
A square means either there's a slight encoding issue or, more likely, the font you chose to display the text doesn't contain that glyph.
If your table uses one of the "N" data types (nchar, nvarchar, ntext) to hold your Unicode text, and you're using the JDBC driver (labeled as MS SQL Server) instead of the ODBC one, then it's most likely a simple font issue.
Got a public page I can see that shows this issue?
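To tell an encoding problem from a missing glyph, it helps to see exactly which characters are in the string. A small diagnostic sketch in Python (the helper name `describe_non_ascii` is made up for this example) lists every non-ASCII character with its code point and Unicode name:

```python
import unicodedata


def describe_non_ascii(text: str) -> list[tuple[str, str, str]]:
    """List each non-ASCII character as (char, 'U+XXXX', Unicode name)."""
    report = []
    for ch in text:
        if ord(ch) > 127:
            # Fall back to "UNKNOWN" for code points that have no name.
            name = unicodedata.name(ch, "UNKNOWN")
            report.append((ch, f"U+{ord(ch):04X}", name))
    return report


for ch, code_point, name in describe_non_ascii("it\u2019s a \u201ctest\u201d"):
    print(code_point, name)
```

If the stored value still has the expected code points (for instance U+2019 for a curly apostrophe), the data is fine and the square is a display or font issue. If the code points have changed, something in the pipeline re-encoded the text.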