Here is a "filtering" exercise to get your Perl regexps in gear in preparation for the DLXS Programmer's Workshop. Email pagliere@umich.edu to be sent the exercise files.

Place these files in any directory you wish. Don't look at sgml2html.pl yet.

Note: keep in mind that this is only a "filtering" exercise. It is provided only to help you start thinking in Perl regexps and about the issue of displaying SGML on a web page. Filtering is only one small part of DLXS software and of what we'll be covering in class. We still need to talk about searching, retrieving, packaging up results, combining filtered results into HTML templates, subclassing for behavior changes, configuration of DLXS software, and so on.

Your assignment, should you decide to accept it, is to write a Perl script that takes test.sgml as input and produces a file like test.html as output.

Study the SGML file to get a sense of its structure. Some information about it is given below. View the HTML file in a browser and also look at its source to get an idea of what the output should be.

The general rules are:

  1. Display the TITLE element from the HEADER as the title of the HTML window and as an H1 in the HTML itself.
  2. A DIV1 may contain a DIV2 (there are only two levels of DIVs). Always print the AUTHOR and the HEAD information for each DIV.
  3. Indent DIV2s using "blockquote".
  4. Print all lines from the poems.
  5. Display in the HTML the line numbers of the lines (the N attribute in the L tag).
  6. Only when you are done with your first take or two of the program are you "allowed" to look at the source of the Perl file.
  7. Have fun.
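
If you want a nudge before starting, here is a minimal sketch of the kind of regexp that rule 5 calls for. It is only a hint for a single rule, not a solution, and it rests on assumptions: the sample lines are made up, the exact markup in test.sgml may differ (attribute quoting, case, whitespace), and the helper name format_line is hypothetical. The output format shown is just one possibility; compare against test.html for the real target.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real test.sgml may be marked up differently.
my @sample = (
    '<L N="1">Shall I compare thee to a summer\'s day?</L>',
    '<L N="2">Thou art more lovely and more temperate:</L>',
);

# Turn one <L N="..."> element into an HTML fragment that shows the
# line number in front of the text. Assumes a quoted, numeric N attribute.
sub format_line {
    my ($sgml) = @_;
    if ($sgml =~ m{<L\s+N="(\d+)"\s*>(.*?)</L>}i) {
        return "$1 $2<br>";
    }
    return $sgml;    # leave anything unrecognized untouched
}

print format_line($_), "\n" for @sample;
```

The capture groups do the work here: the first picks up the value of N and the second picks up the text of the line, so the replacement can put them back together in whatever HTML you prefer. The same capture-and-rebuild idea carries over to TITLE, AUTHOR, and HEAD.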