12 Replies Latest reply: May 2, 2012 7:01 AM by B4stien

    Stripping all non-letters & non-numbers from a string (Unicode)

    arvid terzibaschian Community Member

      Hi there!

       

      I am porting a bit of code from a Java code base and I am really stuck at a point where I need to strip all non-letters (Unicode) from a string. In general I am looking for some kind of Flex support to mimic regular-expression constructs such as:

      \p{L} or \p{Letter}: any kind of letter from any language (see http://www.regular-expressions.info/unicode.html)

      and then use String.replace(/[^\p{L}]/g, "") or something similar, so that everything that is not a letter gets removed.
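
      For reference, a minimal sketch of what this kind of replace looks like on the Java side. The class and method names are hypothetical and only for illustration, not taken from the actual code base; the second pattern is an assumption about how the "non-numbers" part of the title could be covered.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class UnicodeStripper {
    // \P{L} (capital P) matches any character that is NOT a Unicode letter.
    private static final Pattern NON_LETTERS = Pattern.compile("\\P{L}");

    // Assumed variant: matches anything that is neither a letter (L) nor a number (N).
    private static final Pattern NON_LETTERS_OR_NUMBERS =
            Pattern.compile("[^\\p{L}\\p{N}]");

    // Removes every non-letter, including spaces, digits and punctuation.
    static String stripNonLetters(String input) {
        return NON_LETTERS.matcher(input).replaceAll("");
    }

    // Removes everything except letters and numbers.
    static String stripNonLettersOrNumbers(String input) {
        return NON_LETTERS_OR_NUMBERS.matcher(input).replaceAll("");
    }
}
```

      Note that both variants also remove whitespace, because a space is neither a letter nor a number.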

       

       

      If there is no such regexp facility, I could still loop through the string character by character and check each character's Unicode properties with something like isLetter(), as Java and several other languages provide.
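
      The loop variant, again as a minimal Java sketch with hypothetical names, would look roughly like this. It walks the string by code point rather than by char, which is an assumption on my part so that characters outside the Basic Multilingual Plane are handled as well.

```java
// Hypothetical helper, for illustration only.
final class LetterFilter {
    // Keeps only the code points that Character.isLetter() accepts.
    static String keepLetters(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int codePoint = input.codePointAt(i);
            if (Character.isLetter(codePoint)) {
                out.appendCodePoint(codePoint);
            }
            // Advance by 1 or 2 chars depending on the code point's width.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }
}
```

      Swapping Character.isLetter() for Character.isLetterOrDigit() would keep digits as well.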

       

      Just to make clear what I am searching for, I will give a short example:

      If we take a Unicode string containing "this is a unicode [@@Русский@@] multilangual string containing some cyrillic letters and //]] 1234 numbers"

      it should strip out the @@, the [ and ], and the //, so basically everything BUT the letters.

       

      If anyone knows a decent solution, I would appreciate any help!