C# - Regex for all uppercase words until lowercase
I have a sentence like:
NAME: JOHN J. SMITH SomeTag:
How can I grab the JOHN J. SMITH part?
SomeTag is not always the same, so it is more like "all-capitalized words until one is not".
Update
"[A-Z. ]*" returns JOHN J. SMITH S
"[A-Z. ]*\b" returns nothing, as does
"\b[A-Z. ]*\b"
Try this:
[A-Z. ]*\b
Let me know how it goes.
You can be more complete with this one:
[\p{Lu}\p{M}\p{Z}\p{N}\p{P}\p{S}]*\b
but it is a mouthful. Explanation:
Match a single character present in the list below «[\p{Lu}\p{M}\p{Z}\p{N}\p{P}\p{S}]*»
    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
    A character with the Unicode property "uppercase letter" (an uppercase letter that has a lowercase variant) «\p{Lu}»
    A character with the Unicode property "mark" (a character intended to be combined with another character, e.g. accents, umlauts, enclosing boxes) «\p{M}»
    A character with the Unicode property "separator" (any kind of whitespace or invisible separator) «\p{Z}»
    A character with the Unicode property "number" (any kind of numeric character in any script) «\p{N}»
    A character with the Unicode property "punctuation" (any kind of punctuation character) «\p{P}»
    A character with the Unicode property "symbol" (math symbols, currency signs, dingbats, box-drawing characters, etc.) «\p{S}»
Assert position at a word boundary «\b»
Or, shorter (any run of characters that are not lowercase letters):
\P{Ll}*\b

Update 1
After your edit, use this:
NAME: (\P{Ll}*)[ ]
The desired match will be in group 1. Note that I have added [ ] at the end to signal a single space. You can convert that character class to a plain space if you want.
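In C# this becomes:
    string resultString = null;
    try {
        // \P{Ll} matches any character that is not a lowercase letter
        Regex regexObj = new Regex(@"NAME: (\P{Ll}*)[ ]");
        // Groups[1] holds the captured run; it is empty if there was no match
        resultString = regexObj.Match(subjectString).Groups[1].Value;
    } catch (ArgumentException ex) {
        // Syntax error in the regular expression
    }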
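To see why the literal NAME: prefix and the capture group matter, here is a minimal, self-contained sketch that runs both the unanchored pattern and the anchored one against the sample sentence. The class name UppercaseRunDemo, the helper ExtractName, and the sample strings are assumptions made up for illustration; only Regex, Match, and Groups come from System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UppercaseRunDemo
{
    // \P{Ll} = any character that is NOT a lowercase letter.
    // The trailing [ ] forces the capture to stop before the space
    // that precedes the first word containing a lowercase letter.
    private static readonly Regex NamePattern =
        new Regex(@"NAME: (\P{Ll}*)[ ]");

    // Hypothetical helper: returns the captured name, or null if nothing matched.
    public static string ExtractName(string input)
    {
        Match m = NamePattern.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string sample = "NAME: JOHN J. SMITH SomeTag: rest";

        // Unanchored: the run starts at position 0, so "NAME: " is included,
        // and \b leaves a trailing space in the match.
        Console.WriteLine("[" + Regex.Match(sample, @"\P{Ll}*\b").Value + "]");
        // prints "[NAME: JOHN J. SMITH ]"

        // Anchored with a capture group: only the name is returned.
        Console.WriteLine("[" + ExtractName(sample) + "]");
        // prints "[JOHN J. SMITH]"

        // Non-ASCII uppercase letters are not lowercase either, so they are kept.
        string accented = "NAME: JOS\u00C9 N\u00DA\u00D1EZ Tag:";
        Console.WriteLine(ExtractName(accented) == "JOS\u00C9 N\u00DA\u00D1EZ");
        // prints "True"
    }
}
```

One caveat with this sketch: the pattern relies on a word containing a lowercase letter following the name. For an input such as "NAME: JOHN SMITH" with nothing after it, backtracking to the last space means group 1 holds only "JOHN".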