Split on Opening and Closing Regular Expression

The function splitBalancedParentheses searches the given string for an opening and a closing regular expression and returns five values: the text before the opening match, the text that matched the opening regular expression, the text in between, the text that matched the closing regular expression, and the text that is left over. The text in between is treated recursively, so nested opening and closing pairs are skipped until the matching closing expression is found. Informally speaking, splitBalancedParentheses splits at the first top-level LISP expression.

It is even possible to extract text when the opening and closing regular expressions are identical. Of course, the function does not recurse in that case, because the next match is always taken as the closing one.

Note that the regular expressions may match several characters. The input string may also contain newlines between the parentheses.

The function splitBalancedParentheses works as follows. It looks for the opening regular expression. If no match is found, the function indicates this by leaving the first four return values empty and giving back the first input parameter as the fifth value. If the opening regular expression is found, the function saves the text that comes before the match and the string that matched. It then tries to find the matching closing regular expression by means of an auxiliary function, findClosingParenthesis. It is a fatal error if no closing regular expression can be found.

317⟨split on opening and closing regular expression 317⟩≡   (326)  318 ⊳
sub splitBalancedParentheses {
    my ($s, $openparen, $closingparen)=@_;
    my ($before, $oparen, $text, $cparen) = ('', '', '', '');
    my ($after) = $s;
    if ($s =~ /($openparen)/) {
        $before = $`;
        $oparen = $1;
        ($text, $cparen, $after) =
          findClosingParenthesis($', $openparen, $closingparen);
    }
    return ($before, $oparen, $text, $cparen, $after);
}

Defines:
splitBalancedParentheses, used in chunk 304.

Uses findClosingParenthesis 318.

318⟨split on opening and closing regular expression 317⟩+≡   (326)  ⊲317
sub findClosingParenthesis {
    my ($s, $openparen, $closingparen)=@_;
    my ($before, $open, $paren, $text, $close, $after);
    my ($closingParenFound)=0;
    my ($t, $p);
    $paren = '';
    $text = '';
    while ($s =~ /($openparen|$closingparen)/) {
        $text .= $`; # the part before the match
        $s = $'; # the part after the match
        $paren = $1;
        if ($paren =~ /$closingparen/) {$closingParenFound=1; last;}
        ($t, $p, $s) =
          findClosingParenthesis($s, $openparen, $closingparen);
        $text .= "$paren$t$p";
    }
    if ($closingParenFound==0) {die "Closing parenthesis not found";}
    return ($text, $paren, $s);
}

Defines:
findClosingParenthesis, used in chunk 317.
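
The following is a minimal usage sketch and not part of the original chunks. It repeats both subroutines with plain ASCII quotes so that the script runs on its own. The sample strings and the names @nested, @none, @quoted, @block and $err are made up for illustration. It exercises the cases described above: nested parentheses, no opening match, identical opening and closing expressions, multi-character delimiters with newlines in between, and the fatal error for a missing closing expression.

```perl
use strict;
use warnings;

# Copy of chunk 317, with ASCII quotes.
sub splitBalancedParentheses {
    my ($s, $openparen, $closingparen) = @_;
    my ($before, $oparen, $text, $cparen) = ('', '', '', '');
    my ($after) = $s;
    if ($s =~ /($openparen)/) {
        $before = $`;
        $oparen = $1;
        ($text, $cparen, $after) =
          findClosingParenthesis($', $openparen, $closingparen);
    }
    return ($before, $oparen, $text, $cparen, $after);
}

# Copy of chunk 318, with ASCII quotes and unused variables dropped.
sub findClosingParenthesis {
    my ($s, $openparen, $closingparen) = @_;
    my ($paren, $text) = ('', '');
    my $closingParenFound = 0;
    while ($s =~ /($openparen|$closingparen)/) {
        $text .= $`;    # the part before the match
        $s = $';        # the part after the match
        $paren = $1;
        if ($paren =~ /$closingparen/) { $closingParenFound = 1; last; }
        my ($t, $p);
        ($t, $p, $s) = findClosingParenthesis($s, $openparen, $closingparen);
        $text .= "$paren$t$p";
    }
    die "Closing parenthesis not found" if $closingParenFound == 0;
    return ($text, $paren, $s);
}

# Nested parentheses: the inner pair stays inside the third value.
my @nested = splitBalancedParentheses('a (b (c) d) e', '\(', '\)');
print join(' | ', map { "[$_]" } @nested), "\n";
# prints: [a ] | [(] | [b (c) d] | [)] | [ e]

# No opening match: four empty values, input returned as the fifth.
my @none = splitBalancedParentheses('no parens here', '\(', '\)');

# Identical opening and closing expressions: no recursion takes place.
my @quoted = splitBalancedParentheses('say "hi" now', '"', '"');

# Multi-character delimiters with newlines in between.
my @block = splitBalancedParentheses("x BEGIN\n y \nEND z", 'BEGIN', 'END');

# A missing closing expression is fatal; eval traps the die here.
my $err = '';
eval { splitBalancedParentheses('(a (b)', '\(', '\)'); 1 } or $err = $@;
print "error: $err" if $err;
```

Only the first top-level pair is split off. To process further pairs, a caller would presumably apply splitBalancedParentheses again to the fifth return value.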
