python - How to match an emoticon in a sentence with regular expressions


I'm using Python to process Weibo (a Twitter-like service in China) sentences. There are emoticons in the sentences, whose corresponding Unicode code points are \ue317 etc. To process a sentence, I need to encode it as GBK, see below:

 string1_gbk = string1.decode('utf-8').encode('gb2312') 

This raises a UnicodeEncodeError: 'gbk' codec can't encode character u'\ue317'

I tried \\ue[0-9a-zA-Z]{3}, but it did not work. How can I match these emoticons in the sentences?

Try:

string1_gbk = string1.decode('utf-8').encode('gb2312', 'replace') 

This should output ? instead of the emoticons.
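To see how the error handler changes the result, here is a minimal sketch written for Python 3, where str is already Unicode and the .decode('utf-8') step is not needed. The sample sentence is made up for illustration, and 'gbk' is used because that is the codec named in the error message.

```python
# Hypothetical sample text; \ue317 is the code point from the error message.
sentence = "hello\ue317world"

# Strict mode (the default) raises on a character the codec cannot represent.
try:
    sentence.encode("gbk")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' substitutes b"?" for each unencodable character.
replaced = sentence.encode("gbk", "replace")

# 'ignore' drops unencodable characters instead of marking them.
ignored = sentence.encode("gbk", "ignore")
```

If the question marks are unwanted in later processing, 'ignore' may be the more convenient handler, at the cost of losing any trace of where the emoticon was.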

python docs - python wiki
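If the goal is to actually match the emoticons rather than just survive the encoding step, note that a pattern like the one in the question looks for the literal characters of an escape sequence, while the decoded text contains a single code point. A possible sketch in Python 3 follows. It assumes the emoticons all fall in the Unicode private-use area U+E000 to U+F8FF (true for \ue317, but not confirmed for every Weibo emoticon), and the helper names are hypothetical.

```python
import re

# Assumption: emoticons are private-use code points in U+E000..U+F8FF.
# The escapes are resolved by the string literal, so the character class
# contains the real code points, not backslash sequences.
PRIVATE_USE = re.compile("[\ue000-\uf8ff]")


def find_private_use(text):
    """Return every private-use character found in text (hypothetical helper)."""
    return PRIVATE_USE.findall(text)


def strip_private_use(text):
    """Return text with private-use characters removed (hypothetical helper)."""
    return PRIVATE_USE.sub("", text)


sample = "a\ue317b\ue40ac"          # made-up sentence with two emoticons
found = find_private_use(sample)
cleaned = strip_private_use(sample)
encoded = cleaned.encode("gbk")     # no error handler needed after stripping
```

Stripping before encoding keeps the strict error handler in place, so any other unexpected character would still be reported instead of silently becoming a question mark.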

