python - How to match an emoticon in a sentence with regular expressions
I'm using Python to process sentences from Weibo (a Twitter-like service in China). There are emoticons in the sentences, whose corresponding Unicode code points are \ue317 etc. To process a sentence, I need to encode it as GBK, see below:
string1_gbk = string1.decode('utf-8').encode('gb2312')
This raises an error: UnicodeEncodeError: 'gbk' codec can't encode character u'\ue317'
I tried \\ue[0-9a-zA-Z]{3}, but it did not work. How can I match these emoticons in sentences?
Try:
string1_gbk = string1.decode('utf-8').encode('gb2312', 'replace')
This should output ? in place of the emoticons.
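If you do want to match the emoticons with a regular expression rather than replace them at encode time, one likely reason the pattern in the question fails is that it looks for the literal text backslash-u-e followed by three characters, while the decoded string contains the single character U+E317. A minimal sketch follows, assuming Python 3 (in Python 2 the pattern and the sample would need to be u'' literals) and assuming all the emoticons fall in the Private Use Area U+E000 to U+F8FF, which is only a guess based on the \ue317 example. The names PUA_PATTERN, find_emoticons, and strip_emoticons are made up for illustration.

```python
import re

# Assumption: the emoticons are Private Use Area characters (U+E000-U+F8FF),
# like the \ue317 example. The character class matches the characters
# themselves, not the escaped text "\ue317".
PUA_PATTERN = re.compile('[\ue000-\uf8ff]')


def find_emoticons(text):
    """Return every Private Use Area character found in text."""
    return PUA_PATTERN.findall(text)


def strip_emoticons(text):
    """Return text with Private Use Area characters removed."""
    return PUA_PATTERN.sub('', text)


# Hypothetical sample sentence ending in the emoticon from the question.
sample = '今天天气很好\ue317'

emoticons = find_emoticons(sample)
cleaned = strip_emoticons(sample)

# With the emoticons removed, encoding to GBK no longer raises an error.
cleaned_gbk = cleaned.encode('gbk')
```

Stripping before encoding removes the emoticons entirely, whereas the 'replace' error handler leaves a ? where each one was. Which is preferable depends on whether later processing needs to know an emoticon was present.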