Truncating the end of a string in R after a character that can be present zero or more times


I have the following data:

temp <- c("air bags:frontal",
          "service brakes hydraulic:antilock",
          "parking brake:conventional",
          "seats:front assembly:power adjust",
          "power train:automatic transmission",
          "suspension",
          "engine , engine cooling:engine",
          "service brakes hydraulic:antilock",
          "suspension:front",
          "engine , engine cooling:engine",
          "visibility:windshield wiper/washer:linkages")

I want to create a new vector that retains the text before the first ":" in cases where ":" is present, and the whole string when ":" is not present.

I have tried to use:

library(stringr)
temp <- data.frame(matrix(unlist(str_split(temp, pattern = ":", n = 2)),
                          ncol = 2, byrow = TRUE))

but this does not work in cases where there is no ":".
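A minimal sketch of why the matrix step breaks, using base R's strsplit() in place of stringr's str_split() and a shortened, hypothetical vector named example (not part of the original data): the split pieces have unequal lengths, so they cannot be laid out row by row in a two-column matrix.

```r
# Hypothetical three-element subset of the data above
example <- c("air bags:frontal", "suspension", "suspension:front")

# Split each string on a literal colon; the result is a list of character vectors
pieces <- strsplit(example, ":", fixed = TRUE)

# Number of pieces per element: the entry without ":" yields only one piece
piece_counts <- lengths(pieces)
print(piece_counts)

# Flattening gives an odd number of values, which cannot fill two columns evenly
flat <- unlist(pieces)
print(length(flat))
```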

I know this question is similar to "truncate string character in R", which used:

sub("^[^.]*", "", x) 

but I am not familiar with regular expressions and have struggled to reverse that example so that it retains the beginning of the string.

You can solve this with a simple regex:

sub("(.*?):.*", "\\1", x)

 [1] "air bags"                 "service brakes hydraulic" "parking brake"            "seats"
 [5] "power train"              "suspension"               "engine , engine cooling"  "service brakes hydraulic"
 [9] "suspension"               "engine , engine cooling"  "visibility"

How the regex works:

  • "(.*?):.*" looks for a repeated set of any characters, .*, modified with ? so that it is not greedy. This should be followed by a colon and then any character (repeated).
  • Substitute the entire string with the bit found inside the parentheses, "\\1".
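To make the two steps above concrete, here is a small sketch on two inputs taken from the data, one with a colon and one without (the variable names are hypothetical). When the pattern does not match at all, sub() returns the string unchanged, which is why entries without ":" are kept whole.

```r
with_colon    <- "parking brake:conventional"
without_colon <- "suspension"

# The capture group (.*?) holds the text before the first colon;
# "\\1" replaces the whole match with that captured text
kept_a <- sub("(.*?):.*", "\\1", with_colon)

# No colon means the pattern cannot match, so sub() leaves the input as it is
kept_b <- sub("(.*?):.*", "\\1", without_colon)

print(c(kept_a, kept_b))
# [1] "parking brake" "suspension"
```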

The bit to understand is that a regex match is greedy by default. By modifying it to be non-greedy, the first pattern match cannot include a colon, since the first character after the parentheses is a colon. The regex after the colon is back to the default, i.e. greedy.
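The difference between greedy and non-greedy matching only shows up on an entry with more than one colon. The sketch below compares the two on one string from the data. The last line is an alternative without a capture group that should give the same result here; it is an addition for comparison, not part of the answer above.

```r
s <- "seats:front assembly:power adjust"

# Greedy: (.*) grabs as much as it can, so it stops at the LAST colon
greedy <- sub("(.*):.*", "\\1", s)

# Non-greedy: (.*?) grabs as little as it can, so it stops at the FIRST colon
non_greedy <- sub("(.*?):.*", "\\1", s)

# Alternative (assumption, not from the answer): delete from the first colon onward
drop_tail <- sub(":.*", "", s)

print(greedy)      # "seats:front assembly"
print(non_greedy)  # "seats"
print(drop_tail)   # "seats"
```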

