Java regex to support Unicode?
To match A to Z, we use the regex:
[A-Za-z]
How can the regex be made to match UTF-8 characters entered by the user? For example, Chinese words such as 环保部.
What you are looking for are Unicode properties.
For example, \p{L} matches any kind of letter from any language.
So a regex to match such a Chinese word could be something like
\p{L}+
There are many such properties; for more details, see regular-expressions.info.
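As a minimal sketch of the \p{L}+ approach (the class name UnicodeLetterDemo and the helper isAllLetters are hypothetical, made up for illustration), the following checks whether an entire input consists of Unicode letters. Note that inside a Java string literal the backslash has to be doubled. The Chinese word is written with Unicode escapes so the example does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

class UnicodeLetterDemo {
    // \p{L} matches any code point in the Unicode general category "Letter".
    // The backslash is doubled because this is a Java string literal.
    static final Pattern LETTERS = Pattern.compile("\\p{L}+");

    // Hypothetical helper: true only if the whole input is one or more letters.
    static boolean isAllLetters(String input) {
        return LETTERS.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "\u73af\u4fdd\u90e8" is the word 环保部 from the question.
        System.out.println(isAllLetters("\u73af\u4fdd\u90e8")); // prints true
        System.out.println(isAllLetters("Zebra"));              // prints true
        System.out.println(isAllLetters("abc123"));             // prints false: digits are not letters
    }
}
```

matches() requires the whole input to match; to locate letter runs inside longer text, Matcher.find() would be the usual alternative.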
Another option is to use the modifier Pattern.UNICODE_CHARACTER_CLASS.
In Java 7 there is a new flag, Pattern.UNICODE_CHARACTER_CLASS, that enables the Unicode version of the predefined character classes. A separate answer covers this in more detail, with links.
You could do something like this:
Pattern p = Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);
With this flag, \w matches letters and digits from any language (and, of course, some word-combining characters such as _).
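To make the effect of the flag concrete, here is a small comparison sketch (the class name UnicodeWordDemo and the field names are hypothetical). It compiles \w+ once without flags, where \w is limited to ASCII letters, digits, and underscore, and once with Pattern.UNICODE_CHARACTER_CLASS, then tests both against the Chinese word from the question.

```java
import java.util.regex.Pattern;

class UnicodeWordDemo {
    // Without flags, \w is ASCII-only: [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    // With the flag (Java 7+), \w follows the Unicode definition of a word character.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    public static void main(String[] args) {
        // "\u73af\u4fdd\u90e8" is the word 环保部 from the question.
        String word = "\u73af\u4fdd\u90e8";

        System.out.println(ASCII_WORD.matcher(word).matches());   // prints false
        System.out.println(UNICODE_WORD.matcher(word).matches()); // prints true
    }
}
```

The same behavior can also be switched on inside the pattern itself with the embedded flag (?U), for example "(?U)\\w+", which is handy when only the pattern string can be configured.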