stripping non utf-8 characters from string

Guest
Sep 16, 2008

hello all,

I would like to strip (or replace) all non-UTF-8 characters from a string (for example, the value of a form text field). What is the simplest way to achieve that?

thanks in advance,
rudy struyf
TOPICS
Advanced techniques
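
Read literally, the question is about input that is not valid UTF-8. Individual characters cannot be "non-UTF-8", but a raw byte string can contain sequences that are not well-formed UTF-8. The following is a minimal sketch of that literal reading, assuming the input is available as raw bytes; the thread does not say how the form data arrives, and the function names are hypothetical. It uses Python's standard `errors="ignore"` and `errors="replace"` decoding modes.

```python
def strip_invalid_utf8(raw: bytes) -> str:
    """Decode raw bytes as UTF-8, silently dropping invalid byte sequences."""
    return raw.decode("utf-8", errors="ignore")


def replace_invalid_utf8(raw: bytes) -> str:
    """Decode raw bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # The bytes 0xFF and 0xFE can never appear in well-formed UTF-8.
    data = b"caf\xc3\xa9 \xff\xfe ok"
    print(strip_invalid_utf8(data))
    print(replace_invalid_utf8(data))
```

Dropping bytes silently can hide a real encoding mismatch, so the replacing variant is often easier to debug because the U+FFFD markers show where the input was damaged.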

LEGEND
Sep 16, 2008

alpenman69 wrote:
> I would like to strip (or replace) all non utf-8 characters from a string (for example a form-textfield). What is the most simple way to achieve that?

there's no such thing as a non-utf8 char: utf-8 can encode every unicode character. what exactly are you trying to do?

Guest
Sep 16, 2008

I want to clean the string before sending it to a database (SQL Server).

LEGEND
Sep 16, 2008

alpenman69 wrote:
> I would try to clean the string before sending it to a database (sql server)

ok, but strip it of what? everything's in unicode. utf-8's a stingy multi-byte encoding (i.e. it uses extra bytes for a char only when that char needs them), so what exactly are you trying to get rid of?
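
To illustrate the point about UTF-8 being a variable-width encoding, here is a small sketch that reports how many bytes a single character occupies once encoded. The helper name `utf8_width` is made up for this example.

```python
def utf8_width(ch: str) -> int:
    """Return how many bytes UTF-8 needs to encode a single character."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    # ASCII letter, accented Latin letter, euro sign, emoji.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X} takes {utf8_width(ch)} byte(s) in UTF-8")
```

Plain ASCII stays at one byte per character, and only characters further up the Unicode range cost two, three, or four bytes.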

Guest
Sep 17, 2008

when you paste text (for example from MS Word) into a form field and write the string to a SQL Server database, you will see that some characters show up in the database as a symbol (a square).
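
If replacing the troublesome characters really is the goal, one possible approach is to map Word's typographic punctuation to plain ASCII before storing the text. The sketch below assumes the squares come from curly quotes, dashes, ellipses, and non-breaking spaces, which the thread does not confirm. The mapping and the function name are hypothetical and would need extending for other characters.

```python
# Hypothetical mapping from common Word punctuation to ASCII stand-ins.
WORD_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # non-breaking space
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(WORD_PUNCTUATION)


def asciify_word_punctuation(text: str) -> str:
    """Replace the mapped punctuation; all other characters pass through unchanged."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(asciify_word_punctuation("\u201cHello\u201d \u2013 it\u2019s here\u2026"))
```

This changes the user's text, so it is a workaround. Storing the text correctly as Unicode keeps the original characters intact.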

Enthusiast
Sep 17, 2008

a square means either there's a slight encoding issue or, more likely, the font you chose to display the text doesn't contain that glyph.

if your table uses one of the "N" datatypes (nchar, nvarchar, ntext) to hold your unicode text and you're using the JDBC driver (labeled "ms sql server") instead of the ODBC one, then it's most likely a simple font issue.

got a public page i can see that shows this issue?
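
One plausible form of the "slight encoding issue" mentioned above, offered as an assumption and not as a diagnosis of this particular case: Word's curly quotes occupy bytes 0x91 to 0x94 in Windows-1252, and if those bytes are later read as ISO-8859-1 they turn into invisible control characters, which many fonts render as a square. The sketch below reproduces that mismatch; the function name is hypothetical.

```python
import unicodedata


def misread_cp1252_as_latin1(text: str) -> str:
    """Encode text as Windows-1252, then decode the same bytes as ISO-8859-1."""
    return text.encode("cp1252").decode("latin-1")


if __name__ == "__main__":
    original = "it\u2019s"  # contains a right single quotation mark
    damaged = misread_cp1252_as_latin1(original)
    for ch in damaged:
        # Category "Cc" marks a control character, which has no visible glyph.
        print(f"U+{ord(ch):04X} category={unicodedata.category(ch)}")
```

If the stored values contain code points in the U+0080 to U+009F range, a charset mismatch of this kind is a likelier cause than a missing font glyph.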
