FIGURE Resolution

Introduction

This document describes the mechanisms and programs used by DLXS for accessing and displaying images that are defined by the FIGURE element in TextClass document markup.

The FIGURE Element

The FIGURE element is used in TextClass markup to encode the occurrence of a figure in a text document. The FIGURE element's ENTITY attribute carries the ID that the middleware resolves to an image file on disk or to an image under management by ImageClass.

If the document is a TEI Level 1 document (a page image document viewable in Pageviewer), the FIGURE tag supplies additional data about the bitonal page image beyond that supplied by the PB tag. The additional data might be a second, continuous tone scan of the entire page or a continuous tone scan of a plate within the page. The middleware makes the continuous tone (contone) images viewable under ImageClass via links built into the full OCR text of the document (in cases where that OCR is displayable) or via additional links in Pageviewer. For higher TEI levels, the FIGURE tag typically marks the occurrence of an inline figure, whose image is usually on disk but may also be retrieved from ImageClass through the IdResolver mechanism described in the IdResolver section of the Pageviewer documentation.

The balance of this document describes how the FIGURE element is resolved into an image via two mechanisms: Filesystem resolution and IdResolver resolution. It also describes how resolution can be customized by writing subclasses of TextClass.

FIGURE Resolution in General

The resolution mechanism is summarized in the following diagram:

<FIGURE ENTITY="abc"> → transform "abc" to key → lookup key → URL or file system path

The document markup is parsed and the ENTITY attribute value of a given FIGURE tag is extracted. The attribute value is transformed into a key/path suitable for lookup via the DLXS IdResolver or by looking on the disk. If IdResolver is used, the corresponding ImageClass URL is returned. If the filesystem is used, the path to the file in the web space is determined. This URL/path becomes the value of a new attribute which is added to the FIGURE tag and passed along for eventual processing by the XSL stylesheet (principally text.components.xsl). The XSL stylesheet typically transforms the FIGURE tag together with its attributes into an HTML img tag, possibly wrapped with an HTML anchor (a) tag, if a popup window or link into ImageClass is required to view the image of the figure.
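The flow described above can be sketched in a few lines. This is an illustrative Python sketch, not the DLXS middleware (which is written in Perl). The function name resolve_figure, the whitespace-stripping transform, the dictionary standing in for the IdResolver, and the find_on_disk callback are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional


def resolve_figure(
    entity: str,
    is_external: bool,
    id_lookup: Dict[str, str],
    find_on_disk: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return a URL (external figures) or a web space path (on-disk figures)."""
    # Hypothetical transform: the real transform is collection-specific.
    key = entity.strip()
    if is_external:
        # Stand-in for the IdResolver: a plain ID -> URL mapping.
        return id_lookup.get(key)
    # Otherwise the key is turned into a path in the web space.
    return find_on_disk(key)
```

Either branch yields a single string, which is what the middleware would then attach to the FIGURE tag as a new attribute.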

Depending on the situation, the middleware actually adds two or three new attributes to the FIGURE tag for XSL processing: FIGTYPE, HREF_1 and HREF_2. The collection-specific configuration referred to below is discussed in the configuration section later in this document.

  1. FIGTYPE carries one of two values: INLINE or THUMB.
  2. HREF_1 is the URL or web space path to the inline image when FIGTYPE=INLINE or to a thumbnail image of the full sized image when FIGTYPE=THUMB.
  3. HREF_2 is the URL or web space path to the full sized version of the figure image when FIGTYPE=THUMB.
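To make the relationship between the three attributes concrete, here is a minimal Python sketch. It is an illustration only: the function names figure_attributes and to_html are hypothetical, and to_html merely approximates the kind of img and anchor markup the XSL stylesheet is described as producing. It is not the output of text.components.xsl.

```python
from html import escape
from typing import Dict, Optional


def figure_attributes(href_1: str, href_2: Optional[str] = None) -> Dict[str, str]:
    """Build the attributes added to a FIGURE tag.

    With only href_1 the figure is inline (two attributes); with a
    full-sized image as well, href_1 is its thumbnail (three attributes).
    """
    if href_2 is None:
        return {"FIGTYPE": "INLINE", "HREF_1": href_1}
    return {"FIGTYPE": "THUMB", "HREF_1": href_1, "HREF_2": href_2}


def to_html(attrs: Dict[str, str]) -> str:
    """Approximate the stylesheet: an img tag, wrapped in an anchor for thumbnails."""
    img = '<img src="{}">'.format(escape(attrs["HREF_1"], quote=True))
    if attrs["FIGTYPE"] == "THUMB":
        return '<a href="{}">{}</a>'.format(escape(attrs["HREF_2"], quote=True), img)
    return img
```

For example, to_html(figure_attributes("/c/coll/images/abc-th.gif", "/c/coll/images/abc.jpg")) yields an anchor pointing at the full-sized image with the thumbnail inside it. The file names here are made up for the example.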

Note: The debug=resolver URL parameter can be added to the end of the URL to see the action of the resolver as it operates on the ENTITY attribute of the FIGURE tag.

The IdResolver Resolution Mechanism

If the configuration indicates figure images are "external", i.e. that the figure images are managed in ImageClass or by a third-party host, the IdResolver is used to resolve the ID to an ImageClass or third-party URL. The IdResolver mechanism is described in the IdResolver section of the Pageviewer documentation.

The Filesystem Resolution Mechanism

If the configuration indicates figure images are not "external", the figure images should be found as files in the web space. The middleware constructs a default path to the image in the web space as /webdir/images/ENTITY.extension where webdir is the collmgr value for the collection and extension comes from a list (.gif, .jpg, etc.). The middleware tests for file existence in the web space for each extension until a hit occurs. This allows files of several different formats to coexist in the web space.
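The extension probing described above might look like the following Python sketch. It is not the middleware's Perl code. The function name find_figure_file, the web_root parameter (the directory on disk that corresponds to the root of the web space), and the two-element extension list are assumptions; the source gives the list only as ".gif, .jpg, etc.".

```python
import os
from typing import Optional, Sequence

# Assumed list and order; the actual list used by the middleware may differ.
EXTENSIONS = (".gif", ".jpg")


def find_figure_file(
    web_root: str,
    webdir: str,
    entity: str,
    extensions: Sequence[str] = EXTENSIONS,
) -> Optional[str]:
    """Return the web space path of the first existing candidate file, or None."""
    for ext in extensions:
        # Default pattern from the text: /webdir/images/ENTITY.extension
        web_path = "/{}/images/{}{}".format(webdir.strip("/"), entity, ext)
        disk_path = os.path.join(web_root, web_path.lstrip("/"))
        if os.path.isfile(disk_path):
            return web_path  # first hit wins
    return None
```

Because the loop stops at the first hit, the order of the extension list decides which file is used when the same ENTITY exists in more than one format.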

The DLXS directory convention is to store these image files in DLXSROOT/img/c/coll and make a symbolic link to that directory in DLXSROOT/web/c/coll called images.

Default Behavior and Custom Configuration

The default for the figure resolution mechanism assumes all figures are inline, on disk, without corresponding thumbnail images. Within this constraint it is possible to change the way the path to the disk file is generated, to support a number of naming conventions based on the bare ENTITY attribute value. This is described below.

Modifying the default configuration of the figure resolution mechanism is accomplished by writing a subclass of the TextClass package for each affected collection. The methods that need to be written are small, typically just a line or two of code.
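The subclassing pattern can be illustrated with a short Python sketch. The real classes are Perl packages derived from TextClass, and their actual method names are not reproduced here. Every class and method name below is hypothetical and chosen only to show how a one-line override can change how the path is generated.

```python
class BaseFigureResolver:
    """Hypothetical stand-in for the default behavior."""

    def entity_to_basename(self, entity: str) -> str:
        # Default: use the bare ENTITY value unchanged.
        return entity

    def web_path(self, webdir: str, entity: str, extension: str) -> str:
        return "/{}/images/{}{}".format(
            webdir.strip("/"), self.entity_to_basename(entity), extension
        )


class LowercaseFigureResolver(BaseFigureResolver):
    """Hypothetical collection whose image files are named in lowercase."""

    def entity_to_basename(self, entity: str) -> str:
        # The one-line override: only the naming convention changes.
        return entity.lower()
```

With these definitions, LowercaseFigureResolver().web_path("c/coll", "ABC", ".gif") produces a path ending in abc.gif, while the base class keeps ABC.gif. The rest of the resolution logic is inherited unchanged.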

Following is a synopsis of the methods provided for subclassing. Please consult the code in DLXSROOT/cgi/t/text/TextClass.pm and its subclasses for more detail.