Normalization of Data

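Normalizing SGML pays off in every later processing step, because a normalized document spells out markup that the DTD allows authors to leave implicit. Our favorite normalizer is sgmlnorm, from James Clark's SP.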

command: sgmlnorm doctype_file sgml_file > output_sgml_file

Here is an example of how normalization might change an SGML document, with some detail on how this eases parsing.


Do look at all of James Clark's SGML/XML tools.


Normalization: Hands On

To get more of a feel for the process, we'll use the bosnia Makefile to run the normalization (sgmlnorm) step. Before the data can be normalized, though, it must be transformed: the <PB> (page break) tags are processed, and their attributes and values are changed to conform to the expectations of the Page Viewer. After the <PB> tags have been "munged", we will also use the Makefile to check that the SGML is valid before normalizing. This validation step runs nsgmls, James Clark's parser.
 

% cd $HOME/dlxs/idx/b/bosnia
% make noded
% make validate
% make norm
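
For readers who want to try the validate-then-normalize sequence outside the Makefile, the following is a minimal sketch and not the Makefile's actual rules, which are not shown here. The function name normalize_sgml is hypothetical, the <PB> transformation step is not covered, and the sketch assumes that nsgmls exits with a non-zero status when it reports errors. The -s option tells nsgmls to suppress its normal output so that only error messages appear.

```shell
# Validate an SGML file and, only if it parses cleanly, write a normalized copy.
# Hypothetical helper; arguments follow the sgmlnorm command shown earlier:
#   normalize_sgml doctype_file sgml_file output_sgml_file
normalize_sgml() {
    doctype_file=$1
    sgml_file=$2
    output_file=$3

    # Assumption: nsgmls returns non-zero on parse errors.
    # -s suppresses the parser's normal output, leaving only error messages.
    if ! nsgmls -s "$doctype_file" "$sgml_file"; then
        echo "validation failed: $sgml_file" >&2
        return 1
    fi

    # Write to a temporary file first so a failed run leaves no partial output.
    if ! sgmlnorm "$doctype_file" "$sgml_file" > "$output_file.tmp"; then
        rm -f "$output_file.tmp"
        return 1
    fi
    mv "$output_file.tmp" "$output_file"
}
```

Once the function is defined in the current shell, it could be called the same way as the sgmlnorm command above, with the output file given as a third argument instead of a redirect:

% normalize_sgml doctype_file sgml_file output_sgml_file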