Question : Regex To Look-Behind So To Replace Character

I am using PHP's preg_replace(), which takes Perl-compatible regular expressions.

I have this regex to find any occurrence of & and convert it to &amp; to make the page valid XHTML. It skips existing HTML entities etc.:
/&(?!(amp|#x\d{2,4}|#\d{2,4});)/

What I want is a regex to do the same for semicolons: replace every semicolon with its numeric entity (&#59; or &#x3b;).
But as you can see, HTML entities contain a semicolon themselves, and I don't want those to be matched and replaced.

I tried this...
/(?<!&amp|&#x\d{2,4}|&#\d{2,4});/

But look-behind doesn't support that, since the pattern is not of fixed length. Is there a way to write a regex that can do this?


Regards,
Nick

Answer : Regex To Look-Behind So To Replace Character

OK, in Perl you can do it like this:

$x=reverse $your_string;
# in the reversed string "&amp;" reads ";pma&" and "&#x3b;" reads ";b3x#&"
$x=~s:;(?!([a-z]+|[0-9a-f]{2,4}x?#)&):;b3x#&:g;
print scalar reverse($x);

KISS - keep it stupid simple

explanation:
  what you want to do is a look-behind, *not* a look-ahead or a negative look-ahead
  (think of reading forward, western style, from left to right: ahead is right of the current position, behind is left of it)
  AFAIK Perl's regex engine (and PCRE, which PHP uses) only does fixed-length look-behind, so a pattern with {2,4} in it is rejected.
  So I did a simple trick and reversed the string; now we have a look-ahead, and things are simple.
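
For the PHP side, here is a minimal sketch of the same reverse-then-look-ahead idea. It assumes strrev() and preg_replace() behave as documented; escape_semicolons() is a made-up helper name, and the entity pattern is only as strict as the Perl one above (named entities of letters only, 2 to 4 digit numeric ones).

```php
<?php
// Sketch only: PHP port of the "reverse the string, then look ahead" trick.
// escape_semicolons() is a hypothetical helper name, not a PHP built-in.
function escape_semicolons(string $html): string
{
    // strrev() reverses byte-wise; the pattern below only touches ASCII
    // bytes, and the second strrev() restores the original byte order.
    $reversed = strrev($html);

    // In the reversed string "&amp;" reads ";pma&", so the text that used
    // to precede the semicolon now follows it and a look-ahead can test it.
    $pattern = '/;(?!(?:[a-z]+|[0-9a-f]{2,4}x?#)&)/i';

    // ";b3x#&" is "&#x3b;" written backwards.
    $replaced = preg_replace($pattern, ';b3x#&', $reversed);

    return strrev($replaced);
}

echo escape_semicolons('a; b &amp; c &#169; d &#xA9; e;'), "\n";
```

Like the Perl version, this holds a reversed copy of the whole string in memory while it runs.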

A note about performance:
  even in Perl I don't know a way to force reverse to work "in memory", meaning that it does not allocate new memory,
  so you have to deal with at least twice the memory of $your_string.
  Don't feel safe with this either:
    $x=reverse $a;
    $a=reverse $x;
  it's still a copy! Perl's memory management is too clever for that ;-)
But there is good news too: using [a-zA-Z] without the /i modifier is much faster than using [a-z] with /i; keep this in mind when improving my suggestion.

My PHP knowledge isn't good enough to tell you whether PHP can manage such real Perl-ish things (even though it boasts of "Perl regex"); please try it yourself. I'm interested too.
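
If reversing turns out to be awkward in PHP, a different route (not the reverse trick, just an alternative sketch) is to let one pattern match either a whole entity or a bare semicolon, and decide in a callback what to do with each match. This needs no look-behind at all. It assumes preg_replace_callback() is available; escape_bare_semicolons() is a made-up name, and the entity pattern is the same rough one as in the question.

```php
<?php
// Sketch only: no look-behind and no reversed copy of the string.
// escape_bare_semicolons() is a hypothetical helper name.
function escape_bare_semicolons(string $html): string
{
    // First alternative swallows a complete entity including its ";".
    // Second alternative matches any semicolon that is left over.
    $pattern = '/&(?:[a-z]+|#[0-9]{2,4}|#x[0-9a-f]{2,4});|;/i';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Entities pass through unchanged; a bare ";" becomes "&#x3b;".
            return $m[0] === ';' ? '&#x3b;' : $m[0];
        },
        $html
    );
}

echo escape_bare_semicolons('a; b &amp; c &#169; d &#xA9; e;'), "\n";
```

Because the entity alternative is tried first at each position, a semicolon that closes an entity is consumed together with it and never reaches the second alternative.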