Last updated 2003-12-01 11:19:27 EST
Doc Title Image Class Data Requirements
Author 1 Weise, John
CVS Revision $Revision: 1.6 $
Image Class Data Requirements

Each collection used in the Image Services system must have a unique abbreviation. The abbreviation is used in many places in the system.


Unique (across all classes) collection abbreviation (lowercase alphabetic characters only)

Long collection name


Museum of Art


Special Collections Library


French Architecture
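The abbreviation rule can be checked mechanically before a collection is loaded. The following is a minimal sketch, not part of the Image Services tools: the function name and the sample abbreviations are hypothetical, and it assumes "lowercase alphabetic" means the ASCII letters a-z.

```python
import re

# Assumption: "lowercase alphabetic characters only" means ASCII a-z.
ABBREVIATION_RE = re.compile(r"[a-z]+")


def find_abbreviation_problems(abbreviations):
    """Return (abbreviation, reason) pairs for entries that break the rule."""
    problems = []
    seen = set()
    for abbr in abbreviations:
        if not ABBREVIATION_RE.fullmatch(abbr):
            problems.append((abbr, "not lowercase alphabetic"))
        elif abbr in seen:
            # Abbreviations must be unique across all classes.
            problems.append((abbr, "duplicate"))
        seen.add(abbr)
    return problems
```

An empty result means every abbreviation in the list is both well formed and unique within that list.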

Each record must, at a minimum, have fields of the following types:


Unique identification of the record within the collection

Image Filename(s)

The image filename including (master) filename extension. This may be a repeating field. (Not required if there are no images)

Image Caption(s)

Describes the specific view depicted in the image file. If the Image Filename field is repeating, then the Image Caption field(s) may be repeating also (though it is not a requirement). There may be multiple image caption fields. In many cases, and especially when there is only one image file per record, many or all of the fields of the record might be considered "caption" fields. In such a case it usually works best to consider only the fields that describe the view depicted in the image to be caption fields. For example, a good caption field would be one that has data similar to "view from the south" or "verso" or "aerial view".

Remember, this document is about minimal requirements. Please read Image Class - Mapping Image Structures for full coverage of image file topics.
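The minimal requirements above can be pictured as a small record structure with a checking routine. This is only an illustrative sketch: the class, its field names, and the sample values are hypothetical and are not prescribed by Image Class.

```python
import os
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    # Hypothetical field names; Image Class does not prescribe them.
    record_id: str
    image_filenames: List[str] = field(default_factory=list)  # may repeat, may be empty
    image_captions: List[str] = field(default_factory=list)   # may repeat, may be empty


def check_minimal_records(records):
    """Return a list of problem descriptions; an empty list means no problems found."""
    problems = []
    seen_ids = set()
    for position, record in enumerate(records):
        if not record.record_id:
            problems.append(f"record {position}: missing ID")
        elif record.record_id in seen_ids:
            problems.append(f"record {position}: duplicate ID {record.record_id!r}")
        seen_ids.add(record.record_id)
        for name in record.image_filenames:
            # The filename should include the master file's extension.
            if not os.path.splitext(name)[1]:
                problems.append(f"record {position}: filename {name!r} has no extension")
    return problems
```

A record with no image filenames passes the check, since images are not required.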


Requirements for identifiers were loosened with DLXS 10. The Image Class DTD was changed to allow a wider range of characters in IDs. Previously there were significant limitations on the characters allowed within SGML IDs. Unique record IDs in image databases can take many different forms and include many different characters. Even though a wider range of characters is now allowed, the provided script ("idb") for preparing Image Class data continues to convert illegal SGML ID characters into legal, logical representations of those characters in order to ensure legal SGML IDs. This practice has continued in order to maintain backward compatibility for databases that predate the change.

Image Class uses the right square bracket "]" as a delimiter within IDs that are a concatenation of the record ID and the image filename (when present). Any other use of square brackets in IDs and filenames is therefore problematic.
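Because of this delimiter, it can be useful to scan source data for square brackets before running the preparation script. The sketch below is a hypothetical helper, not part of "idb"; it only flags offending values and does not attempt to reproduce the concatenated ID format.

```python
def bracket_problems(record_id, image_filenames):
    """Return the values (ID or filenames) that contain a square bracket."""
    values = [record_id, *image_filenames]
    return [value for value in values if "[" in value or "]" in value]
```

Any value returned should be renamed or otherwise handled before the data is loaded.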

About Images

A database does not have to have digital images associated with it. It is acceptable for a database to not have an Image Filename field.

Any given record in a database may have 0, 1 or multiple image files associated with it.

Other Fields

Typically there are many other fields in a database. This is allowed.

Field Names

Each field is given an abbreviation and label in Image Class. The abbreviation must be a legal MySQL field name. The label can contain ASCII Latin 1 characters.
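A conservative way to screen field abbreviations is to accept only names that MySQL allows without quoting. The sketch below assumes the documented rules for unquoted identifiers (the characters 0-9, a-z, A-Z, "$" and "_", not digits only, at most 64 characters). The function name is hypothetical, and the check does not look for MySQL reserved words.

```python
import re

# Characters MySQL permits in unquoted identifiers (ASCII range).
UNQUOTED_IDENTIFIER_RE = re.compile(r"[0-9A-Za-z$_]+")
MAX_IDENTIFIER_LENGTH = 64  # MySQL limit for column names


def is_safe_field_abbreviation(name):
    """Return True if name is usable as an unquoted MySQL column name.

    Reserved words are NOT checked here; that would need a separate list.
    """
    if not UNQUOTED_IDENTIFIER_RE.fullmatch(name):
        return False
    if name.isdigit():
        # Unquoted identifiers may not consist solely of digits.
        return False
    return len(name) <= MAX_IDENTIFIER_LENGTH
```

For example, `is_safe_field_abbreviation("ic_id")` is accepted, while an abbreviation containing a space is rejected.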

Multiple Uses

In some cases a field has the potential to serve multiple requirements. In the most extreme example, a database might have just one field, "ID". The ID might also be the Image Filename and the Image Caption. Of course this would not likely be very useful for searching, but it makes the point. The more common example is where image files are named by accession number. It is acceptable for a single field to serve more than one of the minimal field requirements. However, it is absolutely critical that there be no ambiguity in the use of the field for multiple purposes. For example, if data in an accession number field are to be used for filenames as well, then the actual image files must be named exactly as the accession numbers.

Use of IDs as filenames is becoming less feasible as Image Class gains support for new media formats. It is a good idea to have separate ID and image filename fields, and the image filenames should include extensions. The extension should match the master image file format, which may be different from the format used on the server.
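When one field doubles as the image filename, a quick comparison against the actual files can expose ambiguity early. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and "named exactly" is taken to mean an exact, case-sensitive match on the whole name.

```python
def unmatched_filenames(field_values, actual_filenames):
    """Return field values that have no exactly matching image file name."""
    available = set(actual_filenames)
    # Exact, case-sensitive comparison: "1990.2.TIF" does not match "1990.2.tif".
    return [value for value in field_values if value not in available]
```

Every value returned points at a record whose image would not be found under the exact-match assumption.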

Character Sets

All characters in the database must be 8-bit ISO 8859 Latin 1 characters. For some collections this may be rather restrictive. At this time, characters that do not comply with the standard will be converted to "?" (a question mark character) and will not be properly searchable. They also will not display properly (the question mark character will display as a placeholder). Some characters may slip through the filters and generate errors at the time of SGML validation, though in most cases the errors may be ignored with the understanding that search, retrieval and display will be affected. Sometimes your only choice is to ignore the errors, if it is not possible to determine what the appropriate character should be.