Image Class DTD Description


The Image Class DTD was designed with information retrieval as its primary purpose; it is not well suited to general storage of image database data. It structures data for fast retrieval and filtering, and frequently used data is replicated so that it is readily accessible. For example, a database record that references 3 image files is replicated 3 times, once in each of 3 <I> (item) elements. The 3 instances are then clustered within an <ENTRY> element so they can be treated either as a single record or as 3 independent search results. The <ISTRUCT> element serves multiple purposes. Among them, it summarizes the most essential data points of an item so that items can be retrieved and displayed quickly. This is especially useful when displaying all of the images associated with a record/ENTRY, because it eliminates the need to retrieve and filter the full item for each image.
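The replication described above can be sketched roughly as follows. This is a minimal illustration only: it uses Python's XML tooling as a stand-in for SGML, and the `ID` and `A` attribute names, the field abbreviation `ti`, and the `build_entry` helper are hypothetical, not taken from the DTD.

```python
import xml.etree.ElementTree as ET

def build_entry(record_id, fields, image_files):
    """Replicate one database record into one <I> per image, all inside one <ENTRY>."""
    # "ID" is a hypothetical attribute name used only for this sketch.
    entry = ET.Element("ENTRY", {"ID": record_id})
    for filename in image_files:
        item = ET.SubElement(entry, "I")
        # Summary element; only the image file reference (M) is shown here.
        ET.SubElement(item, "ISTRUCT", {"M": filename})
        desc = ET.SubElement(item, "D")
        for abbrev, value in fields.items():
            # "A" is a hypothetical attribute carrying the field abbreviation.
            field = ET.SubElement(desc, "C", {"A": abbrev})
            field.text = value
    return entry

entry = build_entry("rec001", {"ti": "Harbor view"}, ["a.tif", "b.tif", "c.tif"])
items = entry.findall("I")
```

Each of the three `<I>` elements carries its own copy of the record's descriptive fields, so any one of them could be returned as a search result without consulting its siblings.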


The <COLL> element provides an essential wrapper around the collection-specific SGML file and provides the primary location for the collection ID, which is replicated strategically through the SGML for easy access by the middleware.

The <GEN> element, for "general", holds information that applies to the entire collection and that other elements often draw on.

The RULES, PROGR, DATA, and DATE attributes of the <GEN> element provide basic administrative data about the method used to create the SGML file. This data is not used in the information retrieval process.

The <ADMIN> elements, meaning "administrative," provide unique, abbreviated representations of the Administrative Field Mappings. These field mappings are used to flag certain critical fields of the database, such as the fields that hold image filenames and record ID numbers. Most of these mappings are used only in the process of generating the SGML. For example, the IS.fn mapping is used in the generation of the ISTRUCT element's M attribute (image file reference).

The <META> elements establish unique abbreviations for the "meta" fields. Collection-specific fields are mapped to the meta-fields as part of a high-level field mapping process that ultimately enables cross-collection searching.

The <BASE> elements establish unique abbreviations for the collection-specific field names. The abbreviations are used within the <I> (item) elements to label fielded data.
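As a rough illustration of how meta-field mappings could support cross-collection searching, the sketch below resolves one meta-field to the collection-specific abbreviations mapped to it. The collection IDs, all abbreviations, and the `fields_for_meta` helper are hypothetical; the DTD's actual mapping syntax is not shown here.

```python
# Hypothetical mappings: collection-specific abbreviation -> meta-field abbreviation.
COLLECTION_TO_META = {
    "coll_a": {"wt": "ti", "ph": "cr"},
    "coll_b": {"cap": "ti", "art": "cr"},
}

def fields_for_meta(meta_abbrev):
    """Return {collection_id: [collection-specific abbreviations]} for one meta-field."""
    result = {}
    for coll_id, mapping in COLLECTION_TO_META.items():
        matches = [base for base, meta in mapping.items() if meta == meta_abbrev]
        if matches:
            result[coll_id] = matches
    return result

# A search on the meta-field "ti" fans out to a different field in each collection.
title_fields = fields_for_meta("ti")
```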

The <ENTRY> element provides the outermost structure for the representation of a data record. The collection identifiers are replicated here to facilitate retrieval and filtering from SGML to HTML for display. In cases where the record has multiple images associated with it, the <ENTRY> element binds together the information for each image, which is wrapped in the <I> (for "item") elements.

The <ENTRYAUTH> element helps determine which groups of users are allowed to access the large-size images associated with the record. It applies to all items in the <ENTRY> and is critical to the authorization mechanisms of the CGI. See also Image Class Access Control Summary and Examples Table.

For every image of a record there is an <I> element, meaning "item". The attributes of the <I> element are essential to information retrieval and to filtering data for display to the user. Each <I> element of an <ENTRY> can be a single search result. In other words, while an <ENTRY> is a complete representation of the original data record, the <I> element is the unit retrieved and displayed to the user by the middleware.

The <INO> element is identical to the <I> element except that it is neither searched nor displayed as a search result. It is retrieved and displayed only as an image related to the record. The most common case is a record with one overview image and several detail images, where it is preferable to display only the overview image as a search result.
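The distinction between <I> and <INO> might be modeled as in the sketch below, which again uses XML parsing as a stand-in for SGML. The sample content and the helper names `search_hits` and `related_images` are hypothetical, and real middleware would work quite differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry: one searchable overview image and two non-searchable details.
ENTRY_XML = """
<ENTRY>
  <I><ISTRUCT M="overview.tif"/><D><C>bridge overview</C></D></I>
  <INO><ISTRUCT M="detail1.tif"/><D><C>bridge detail</C></D></INO>
  <INO><ISTRUCT M="detail2.tif"/><D><C>bridge detail</C></D></INO>
</ENTRY>
"""
entry = ET.fromstring(ENTRY_XML)

def search_hits(entry, term):
    """Only <I> elements are candidates for search results."""
    return [
        item.find("ISTRUCT").get("M")
        for item in entry.findall("I")
        if any(term in (c.text or "") for c in item.iter("C"))
    ]

def related_images(entry):
    """Both <I> and <INO> elements are shown when the whole record is displayed."""
    return [child.find("ISTRUCT").get("M") for child in entry if child.tag in ("I", "INO")]
```

Here a search for "bridge" would yield only the overview image, even though the detail images contain the same word, while displaying the record would list all three images.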

The <ISTRUCT> element holds a summary of the most essential information about the image. This includes the image filename, the caption, and whether or not the image file was present in the file system at the time of encoding. Additionally, the ISTRUCT element holds structural metadata that describes the visual relationship of the image to the other images of the entry. For example, if the images of the entry combine to form a tiled matrix representing a larger whole, this is stored in the structural metadata. The structural metadata attributes are STID, FACE, STTY, X, and Y, and they are described more fully in the document Image Class - Mapping Image Structures.
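To make the tiled-matrix case concrete, the sketch below arranges image filenames into a grid. It assumes, purely for illustration, that X and Y hold 1-based column and row positions; the actual semantics of these attributes are defined in Image Class - Mapping Image Structures and may differ. The filenames and the `tile_grid` helper are hypothetical.

```python
# Hypothetical ISTRUCT attribute sets for four tiles of one larger image.
tiles = [
    {"M": "nw.tif", "X": "1", "Y": "1"},
    {"M": "ne.tif", "X": "2", "Y": "1"},
    {"M": "sw.tif", "X": "1", "Y": "2"},
    {"M": "se.tif", "X": "2", "Y": "2"},
]

def tile_grid(tiles):
    """Arrange filenames into rows and columns, assuming 1-based X (column) and Y (row)."""
    width = max(int(t["X"]) for t in tiles)
    height = max(int(t["Y"]) for t in tiles)
    grid = [[None] * width for _ in range(height)]
    for t in tiles:
        grid[int(t["Y"]) - 1][int(t["X"]) - 1] = t["M"]
    return grid

grid = tile_grid(tiles)
```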

The <D> element provides a wrapper around the descriptive metadata of each item.

The <C> element, for "category", tags the fielded descriptive content of each record. Its attributes use the unique abbreviations that were established in the <GEN> element to provide logical mappings of collection-specific fields to the general "meta" fields used for cross-collection searching. The attributes are essential to formulating queries on specific fields, such as a search of the "Title" field. The abbreviations also appear in the HTML interface templates, where the middleware uses them to populate placeholders with data.

The <CNO> element is identical to the <C> element except that it is for fields that should not be searched. It covers situations where the content provider wants a field to be displayable but not searchable.
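A fielded search that respects the <C>/<CNO> distinction could look roughly like the sketch below. The `A` attribute name, the abbreviations `ti` and `no`, the sample text, and both helper functions are hypothetical; XML parsing stands in for SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical item with one searchable field and one display-only field.
ITEM_XML = """
<I>
  <D>
    <C A="ti">Harbor at dusk</C>
    <CNO A="no">Harbor misidentified in the original ledger</CNO>
  </D>
</I>
"""
item = ET.fromstring(ITEM_XML)

def field_matches(item, abbrev, term):
    """Search only <C> elements whose (hypothetical) A attribute names the field."""
    term = term.lower()
    return any(
        c.get("A") == abbrev and term in (c.text or "").lower()
        for c in item.iter("C")
    )

def display_fields(item):
    """Both <C> and <CNO> content is available for display."""
    return [(el.get("A"), el.text) for el in item.find("D") if el.tag in ("C", "CNO")]
```

A search of the `ti` field for "harbor" matches, while the same term in the <CNO> field is ignored by the search yet still returned by `display_fields`.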