javascript - Mysterious garbage character - IE 8 only
I am building a table whose content is pulled from other elements in the page (page scraping).
I am using innerText or textContent to pull the text, then a regular expression to trim it:
string.replace(/^\s+|\s+$/g,"");
This works fine in IE 9 and Chrome, but in IE 8 I am getting a garbage character that I cannot identify. I was able to reproduce the behavior with alerts in a jsFiddle:
What is this character, and how can I get rid of it?
Update: Thanks for the helpful replies! It seems the character in question is U+200E (left-to-right mark). The second part of my question remains: how can I get rid of such characters with regular expressions and keep the regular text?
Both the "at risk" and "complete" <th> tags in your jsFiddle snippet have a U+200E (left-to-right mark, aka LRM) code point at the end of their content. It is not a whitespace character, so it cannot be matched by \s.
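If you want to confirm this yourself, one option is to dump the code units of the scraped text instead of alerting the raw string. The sketch below is only an illustration: describeChars is a hypothetical helper name, and the sample string is an assumption standing in for the real <th> content. It sticks to ES3-era syntax so it should also run in IE 8.

```javascript
// Hypothetical helper: list each UTF-16 code unit of a string as "U+XXXX".
function describeChars(text) {
  var codes = [];
  for (var i = 0; i < text.length; i++) {
    var hex = text.charCodeAt(i).toString(16).toUpperCase();
    // Left-pad to at least four hex digits.
    while (hex.length < 4) {
      hex = "0" + hex;
    }
    codes.push("U+" + hex);
  }
  return codes.join(" ");
}

// Assumed sample: visible text followed by an invisible left-to-right mark.
var sample = "Complete\u200E";
var dump = describeChars(sample);

// \s does not cover U+200E, so a trim regex leaves the mark in place.
var trimmed = sample.replace(/^\s+|\s+$/g, "");
var stillHasMark = trimmed.charAt(trimmed.length - 1) === "\u200E";
```

Here dump ends in U+200E and stillHasMark is true, which matches what the trim regex does to the header text.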
One way to get rid of the character is to use the XRegExp library, with which you can replace all matches of \p{C} with an empty string (i.e., delete them). \p{C} matches any code point in Unicode's "Other" category, which includes control, format, private use, surrogate, and unassigned code points. U+200E, specifically, is within the \p{Cf} "Other, format" subcategory.
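If pulling in a library is not an option, a narrower alternative is a plain character class that lists the offending code points explicitly. This is a sketch under assumptions, not an equivalent of \p{C}: the class below covers only a hand-picked set of directional formatting characters (U+200E, U+200F, and U+202A through U+202E), and cleanText and FORMAT_CHARS are hypothetical names.

```javascript
// Hand-picked subset of the "Other, format" characters (an assumption,
// not the full \p{Cf} set): LRM, RLM, and the embedding/override marks.
var FORMAT_CHARS = /[\u200E\u200F\u202A-\u202E]/g;

// Hypothetical helper: drop the listed format characters, then trim
// leading and trailing whitespace with the regex from the question.
function cleanText(text) {
  return text
    .replace(FORMAT_CHARS, "")
    .replace(/^\s+|\s+$/g, "");
}

// Assumed sample input: padded text with a trailing left-to-right mark.
var cleaned = cleanText("  At Risk \u200E");
```

The format characters are removed before trimming so that whitespace sitting just inside a trailing mark is also stripped. Ordinary text, including spaces between words, is left alone.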