javascript - Mysterious garbage character - IE 8 only


I am building a table whose content is pulled from other elements on the page (page scraping).

I am using innerText or textContent to pull the text, and then a regular expression to trim it:

string.replace(/^\s+|\s+$/g,""); 

This works fine in IE 9 and Chrome, but in IE 8 I am getting a garbage character that I cannot identify. I was able to reproduce the behavior with alerts in a jsFiddle:

http://jsfiddle.net/te4fq/

What is this character, and how can I get rid of it?

Update: thanks for the helpful replies! It seems the character in question is U+200E (left-to-right mark). The second part of the question remains: how can I get rid of such characters with regular expressions and keep the regular text?

Both the "At Risk" and "Complete" <th> tags in the jsFiddle snippet have a U+200E (left-to-right mark, aka LRM) code point at the end of their content. It is not a whitespace character, so it cannot be matched by \s.
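To confirm which invisible character is hiding in a string, one option is to dump its code units in hex. The sketch below is only an illustration: describeCodeUnits is a hypothetical helper name, not something from the fiddle, and it sticks to ES5-era syntax so it should also run in IE 8.

```javascript
// Hypothetical helper: list every UTF-16 code unit of a string as "U+XXXX".
function describeCodeUnits(text) {
  var parts = [];
  for (var i = 0; i < text.length; i++) {
    var hex = text.charCodeAt(i).toString(16).toUpperCase();
    // Left-pad to at least four hex digits.
    while (hex.length < 4) {
      hex = "0" + hex;
    }
    parts.push("U+" + hex);
  }
  return parts.join(" ");
}

// A trailing left-to-right mark shows up as U+200E.
var report = describeCodeUnits("ok\u200E"); // "U+006F U+006B U+200E"
```

Passing the scraped text through a helper like this (for example in an alert) makes a trailing U+200E visible even though the character itself renders as nothing.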

One way to get rid of the character is to use the XRegExp library, with which you can replace all matches of \p{C} with an empty string (i.e., delete them). \p{C} matches any code point in Unicode's "Other" category, which includes control, format, private use, surrogate, and unassigned code points. U+200E, specifically, is within the \p{Cf} "Other, format" subcategory.
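If pulling in a library is not an option, a plain character class can cover the most common invisible format characters. This is a minimal sketch under stated assumptions: the ranges below are a hand-picked subset of \p{Cf} (not the whole "Other" category), and cleanText and INVISIBLE_FORMAT_CHARS are hypothetical names chosen for illustration.

```javascript
// Hand-picked invisible format characters, a subset of \p{Cf} (assumption):
//   U+200B-U+200F  zero-width space/joiners, LRM, RLM
//   U+202A-U+202E  bidirectional embedding and override controls
//   U+2060-U+2064  word joiner and invisible operators
//   U+FEFF         zero-width no-break space (byte order mark)
var INVISIBLE_FORMAT_CHARS = /[\u200B-\u200F\u202A-\u202E\u2060-\u2064\uFEFF]/g;

// Hypothetical helper: strip the invisible characters, then trim whitespace.
function cleanText(text) {
  return text
    .replace(INVISIBLE_FORMAT_CHARS, "")
    .replace(/^\s+|\s+$/g, "");
}

var cleaned = cleanText("  At Risk \u200E"); // "At Risk"
```

Stripping the format characters before trimming matters here: a trailing U+200E would otherwise shield the whitespace in front of it from the \s+$ match.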

