<!ELEMENT E - o ( #PCDATA ) >
<!ATTLIST E
N NUMBER #REQUIRED
O NUMBER #REQUIRED
L CDATA #IMPLIED >
And this un-normalized SGML fragment:
<e o="1" n="3" l="emile">emile </e>
<e n="4" o="23">georgina
<e n="5" o="1" l="holbach">holbach </e>
After normalization we would expect:
<E N="3" O="1" L="emile">emile </E>
<E N="4" O="23">georgina </E>
<E N="5" O="1" L="holbach">holbach </E>
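So we see that normalization upper-cases the element and attribute names, puts the attributes in the order the DTD declares them, and supplies the omitted end tag. A single Perl regular expression is then enough to match every element: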
$tag =~ m,<E N=\"([^\"]*?)\" O=\"([^\"]*?)\"( L=\"([^\"]*?)\")?>([^<]*?)</E>, ;
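As a minimal sketch of how that pattern might be used, the script below runs it over the three normalized lines from the example. The helper name parse_entry and the hash keys are hypothetical, chosen only for illustration. The unnecessary backslashes before the quotes are dropped, which does not change what the pattern matches.

```perl
use strict;
use warnings;

# Normalized lines copied from the example above.
my @normalized = (
    '<E N="3" O="1" L="emile">emile </E>',
    '<E N="4" O="23">georgina </E>',
    '<E N="5" O="1" L="holbach">holbach </E>',
);

# parse_entry is a hypothetical helper: it applies the pattern and
# returns a hash reference on success, or nothing when the tag does
# not match.
sub parse_entry {
    my ($tag) = @_;
    if ($tag =~ m,<E N="([^"]*?)" O="([^"]*?)"( L="([^"]*?)")?>([^<]*?)</E>,) {
        # $3 is the whole optional ' L="..."' group; $4 is its value and
        # stays undefined when the L attribute is absent.
        my %entry = (n => $1, o => $2, l => $4, text => $5);
        return \%entry;
    }
    return;
}

for my $line (@normalized) {
    my $entry = parse_entry($line) or next;
    my $l = defined $entry->{l} ? $entry->{l} : '(none)';
    printf "n=%s o=%s l=%s text=%s\n",
        $entry->{n}, $entry->{o}, $l, $entry->{text};
}
```

The same helper returns nothing for the un-normalized line <e n="4" o="23">georgina, because the tag name is lower-case and the end tag is missing.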
To match on the un-normalized SGML, we would need to handle: