regex - Regular expression for finding and replacing variable URL strings in XML


I'm having difficulty figuring out a regular expression for stripping out part of a string within a particular XML tag and replacing it. I have a number of URL paths with variable parts, and I need to find everything between a fixed string and the last slash in the URL. For example, I might have tags and URLs like this:

<bpoc:resourcemetadataloc>http://app01/media/images/i//1951-1960_embark_object_photos/1957.59.jpg</bpoc:resourcemetadataloc>

or

<bpoc:resourcemetadataloc>http://app01/media/images/contemporary/1986-2005/1991.2.jpg</bpoc:resourcemetadataloc>

The output should look like this:

<bpoc:resourcemetadataloc>http://app01/media/previews/1957.59.jpg</bpoc:resourcemetadataloc>

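This is as far as I got, but it captures up to the last slash in the whole string (the one in the closing tag), not the second-to-last slash, which is the last slash of the URL: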

(<bpoc:resourcemetadataloc>http://app01/media/images)+(.*[/])

That regex will capture the following:

<bpoc:resourcemetadataloc>http://app01/media/images/i//1951-1960_embark_object_photos/1957.59.jpg</

What do I need to add to the regex to exclude the </bpoc:resourcemetadataloc> bit from the match, and capture only what comes before the last slash in the URL?

Because this is XML, there can't be a (non-escaped) < or > in the URL itself. You can use that to your advantage:

<bpoc:resourcemetadataloc>http://app01/media/images[^<]*/([^<]*) 

This should capture the last segment (e.g. "1957.59.jpg") of the URL. It works by greedily matching everything up to the start of the end tag (the first [^<]*), then backtracking to match the nearest (i.e. last) /, and then capturing everything after that slash (the ([^<]*)) as group 1, which you can use during the replacement step.
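
As a rough sketch of the replacement step, here is how this could look in Python. The question doesn't say which tool or language is being used, so Python's re module is an assumption here, and the function name rewrite_urls is made up for illustration. The pattern is the one above with one addition: the fixed prefix is wrapped in its own group so it can be put back in the replacement, which shifts the file name to group 2.

```python
import re

# Group 1: the tag plus the fixed part of the URL that should be kept.
# Group 2: the file name after the last slash (group 1 in the pattern above).
# [^<]* cannot run past the closing tag, because it never matches "<".
PATTERN = re.compile(
    r"(<bpoc:resourcemetadataloc>http://app01/media/)images[^<]*/([^<]*)"
)


def rewrite_urls(xml_text: str) -> str:
    """Replace the variable images/... path with previews/ (hypothetical helper)."""
    return PATTERN.sub(r"\1previews/\2", xml_text)


sample = (
    "<bpoc:resourcemetadataloc>http://app01/media/images/i//"
    "1951-1960_embark_object_photos/1957.59.jpg</bpoc:resourcemetadataloc>\n"
    "<bpoc:resourcemetadataloc>http://app01/media/images/contemporary/"
    "1986-2005/1991.2.jpg</bpoc:resourcemetadataloc>"
)

result = rewrite_urls(sample)
print(result)
```

Most regex-capable editors and tools accept the same pattern; only the backreference syntax in the replacement tends to differ (for example $1 and $2 instead of \1 and \2).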

