Extracting Substrings
Problem
You want to extract the substring preceding or following a particular match.
Solution
Use the preg_split() function to split the original string into an array, using the
match term as the delimiter, and then extract the appropriate array element(s):
<?php
// define string
$html = "Just when you begin to think the wagon of " .
    "<a name='#war'>Vietnam</a>-grounded movies is grinding to a slow halt, " .
    "you're hit squarely in the <a name='#photo'>face</a> with another " .
    "one. However, while other movies depict the gory and glory of war " .
    "and its effects, this centers on the " .
    "<a name='#subject'>psychology</a> of troopers before " .
    "they're led to battle.";
// split on <a> element
$matches = preg_split("/<a(.*?)>(.*?)<\/a>/i", $html);
// extract substring preceding first match
// result: "Just when…of"
echo $matches[0];
// extract substring following last match
// result: "of troopers…battle."
echo $matches[sizeof($matches)-1];
?>
Comments
The preg_split() function accepts a regular expression and a search string, and
uses the regular expression as a delimiter to split the string into segments. Each
of these segments is placed in an array. Extracting the appropriate segment is then
simply a matter of retrieving the corresponding array element.
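To make the segment array concrete, here is a minimal sketch. It uses a short sample string and a <b> element pattern, both of which are illustrative assumptions and not part of the listing above:

```php
<?php
// hypothetical sample string, chosen only for illustration
$text = "alpha <b>one</b> beta <b>two</b> gamma";

// split on every <b>...</b> element; the matched elements themselves
// are discarded and only the text around them is kept
$segments = preg_split('/<b>.*?<\/b>/i', $text);

// $segments now holds three elements:
//   [0] => "alpha "
//   [1] => " beta "
//   [2] => " gamma"
print_r($segments);
```

Note that the whitespace around each match stays in the segments, so you may want to trim() an element before using it.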
This is clearly illustrated in the previous listing. To extract the segment preceding
the first match, retrieve the first array element (index 0); to extract the segment
following the last match, retrieve the last array element.
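The two retrievals can also be wrapped in small helpers. The sketch below is one possible way to do so: the function names textBeforeFirstMatch() and textAfterLastMatch() are hypothetical, and the code assumes the pattern is a valid regular expression (preg_split() returns false otherwise).

```php
<?php
// hypothetical helper: return the text preceding the first match of $pattern
function textBeforeFirstMatch(string $pattern, string $subject): string
{
    // a limit of 2 stops splitting after the first match,
    // so index 0 is everything before that match
    $parts = preg_split($pattern, $subject, 2);
    return $parts[0];
}

// hypothetical helper: return the text following the last match of $pattern
function textAfterLastMatch(string $pattern, string $subject): string
{
    $parts = preg_split($pattern, $subject);
    return $parts[count($parts) - 1];
}

// sample data, chosen only for illustration
$text = "red, green; blue";
echo textBeforeFirstMatch('/[,;]\s*/', $text), "\n"; // prints "red"
echo textAfterLastMatch('/[,;]\s*/', $text), "\n";   // prints "blue"
```

If the pattern does not match at all, preg_split() returns a one-element array containing the whole string, so both helpers return the original string unchanged.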
If your match term is one or more literal words, rather than a regular expression,
you can accomplish the same task more easily by using explode() to split the string
into an array on the match term.
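A minimal sketch of the explode() approach follows. The sample string and the ": " separator are assumptions made for illustration:

```php
<?php
// hypothetical sample string, chosen only for illustration
$line = "name: Jane Doe: admin";

// split on the literal match term ": " (no regular expression involved)
$parts = explode(": ", $line);

// substring preceding the first match
echo $parts[0], "\n";                 // prints "name"

// substring following the last match
echo $parts[count($parts) - 1], "\n"; // prints "admin"
```

Because explode() treats its separator as a plain string, characters such as "." or "?" need no escaping, which is what makes it the simpler choice for fixed match terms.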