Java regex to support Unicode?


To match A to Z, we use the regex:

[A-Za-z]

How can I allow the regex to match UTF-8 characters entered by the user? For example, Chinese words such as 环保部.

What you are looking for are Unicode properties.

E.g. \p{L} matches any kind of letter from any language.

So a regex to match such a Chinese word could be something like

\p{L}+

There are many such properties; for more details see regular-expressions.info.
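As a minimal sketch of the property-based approach, the class below checks whether a string consists only of letters (\p{L}) or only of Han characters (\p{IsHan}, a script property available since Java 7). The class and method names are hypothetical, chosen for illustration, and the sample word 环保部 is written with Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.regex.Pattern;

public class UnicodeLetterDemo {
    // \p{L}+ : one or more letters from any language (hypothetical helper name)
    private static final Pattern ANY_LETTERS = Pattern.compile("\\p{L}+");
    // \p{IsHan}+ : one or more characters in the Han script only
    private static final Pattern HAN_ONLY = Pattern.compile("\\p{IsHan}+");

    static boolean isAllLetters(String input) {
        return ANY_LETTERS.matcher(input).matches();
    }

    static boolean isAllHan(String input) {
        return HAN_ONLY.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The Chinese word from the question, as Unicode escapes
        String word = "\u73AF\u4FDD\u90E8";
        System.out.println(isAllLetters(word));     // true
        System.out.println(isAllHan(word));         // true
        System.out.println(isAllLetters("abc"));    // true
        System.out.println(isAllHan("abc"));        // false: Latin letters are not Han
        System.out.println(isAllLetters("abc123")); // false: digits are not letters
    }
}
```

Use \p{L} when any alphabet is acceptable, and a script property such as \p{IsHan} when the input should be restricted to one writing system.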

Another option is to use the modifier

Pattern.UNICODE_CHARACTER_CLASS

In Java 7 there is a new flag, Pattern.UNICODE_CHARACTER_CLASS, that enables the Unicode version of the predefined character classes.

You could do something like this:

Pattern p = Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

and \w would then match letters and digits from any language (and of course some connector characters such as _).
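To illustrate the difference the flag makes, here is a small sketch that runs \w+ with and without Pattern.UNICODE_CHARACTER_CLASS. The class and method names are hypothetical; the embedded form (?U) is the inline equivalent of the flag.

```java
import java.util.regex.Pattern;

public class UnicodeWordDemo {
    // Default \w covers only [a-zA-Z_0-9]
    private static final Pattern ASCII_WORD = Pattern.compile("\\w+");
    // With the flag, \w follows the Unicode definition of a word character
    private static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);
    // Same behaviour, enabled inline via the embedded flag (?U)
    private static final Pattern UNICODE_WORD_INLINE = Pattern.compile("(?U)\\w+");

    static boolean isAsciiWord(String input) {
        return ASCII_WORD.matcher(input).matches();
    }

    static boolean isUnicodeWord(String input) {
        return UNICODE_WORD.matcher(input).matches();
    }

    static boolean isUnicodeWordInline(String input) {
        return UNICODE_WORD_INLINE.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The Chinese word from the question, as Unicode escapes
        String word = "\u73AF\u4FDD\u90E8";
        System.out.println(isAsciiWord(word));          // false
        System.out.println(isUnicodeWord(word));        // true
        System.out.println(isUnicodeWordInline(word));  // true
        System.out.println(isUnicodeWord("abc_123"));   // true
    }
}
```

Note that the flag also changes \d, \s, \b and the POSIX classes, so it is worth checking that the broader matching is acceptable for the whole pattern.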

