
Remove NULL character in XML by RegEx??

Explorer, Jul 11, 2008

I have XML being returned that appears to have NULL characters in it. When I try to use XMLParse() I get the following error:

An invalid XML character (Unicode: 0x0) was found in the element content of the document.

If I save the XML to .txt file then read it again I can parse it. That's not really the way I want to do it though as it'll be slow. I'm sure this can be done through a regex. Any ideas?
TOPICS
Advanced techniques

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Explorer, Jul 12, 2008

I messed around with a RegEx and came up with this:

REReplace(thisXML,'[\x0]','','ALL')

It seems to work, but I'm no Unicode or regex expert. If someone who knows RegEx and Unicode well could review it and tell me whether it's truly removing only NULLs, that would be great.
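
One way to check this without a regex at all: a single known character can be removed with a plain `Replace()`, and the result compared against the regex version. This is only a minimal sketch. It assumes `thisXML` already holds the raw response string and that `Chr(0)` yields a NUL character on your ColdFusion version; `cleanXML`, `removed`, `viaRegex` and `sameResult` are names made up for illustration.

```cfml
<cfscript>
// thisXML is assumed to already hold the raw XML string from the response.
// Chr(0) is the NUL character; scope "ALL" removes every occurrence.
cleanXML = Replace(thisXML, Chr(0), "", "ALL");

// Each NUL is one character, so the length difference is the count removed.
removed = Len(thisXML) - Len(cleanXML);

// Sanity check: if the regex removes NULs and nothing else,
// it should produce exactly the same string as the plain Replace().
viaRegex = REReplace(thisXML, "[\x0]", "", "ALL");
sameResult = (Compare(cleanXML, viaRegex) EQ 0);

// Parse the cleaned string.
xmlDoc = XMLParse(cleanXML);
</cfscript>
```

If `sameResult` comes back false, the regex is matching something other than NULs and is worth a closer look.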

New Here, Jul 14, 2008

Well, it's a relatively simple regex, so there isn't much to verify. You've got the right expression for hex code 0. I'm not sure you need the brackets at this point (they indicate a character class), but it's easier to start with them so that you don't have to remember to add them once you find other characters to exclude.

As near as I can tell, it should be what you want. You may end up wanting a more complicated regex if you find other invalid characters you want to remove (like byte order marks), but that could be done in a separate statement.
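
If it does come to that, a rough sketch of the "separate statement" approach might look like the following. It assumes the regex engine accepts two-digit `\xhh` escapes inside a character class in the same way it accepts the `\x0` form above, and that a byte order mark shows up as a leading U+FEFF character in the string. The variable name `cleanXML` is hypothetical. The ranges cover the control characters that XML 1.0 does not allow (everything below 0x20 except tab, line feed and carriage return).

```cfml
<cfscript>
// thisXML is assumed to hold the raw XML string, as in the earlier posts.
cleanXML = thisXML;

// Statement 1: drop a leading byte order mark (U+FEFF, decimal 65279), if any.
if (Len(cleanXML) GT 0 AND Compare(Left(cleanXML, 1), Chr(65279)) EQ 0) {
    cleanXML = RemoveChars(cleanXML, 1, 1);
}

// Statement 2: remove control characters that XML 1.0 rejects.
// Tab (\x09), line feed (\x0A) and carriage return (\x0D) are legal, so they are kept.
cleanXML = REReplace(cleanXML, "[\x00-\x08\x0B\x0C\x0E-\x1F]", "", "ALL");

xmlDoc = XMLParse(cleanXML);
</cfscript>
```

Keeping the two steps separate makes it easier to tell which kind of character was actually causing the parse error.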
