DLXS Middleware Library Modules

This document describes the DLXS Perl Library Modules used by the DLXS middleware. Unless otherwise noted, any file with a ".pm" extension is an Object Oriented Perl module. Modules are marked with the classes that use them.

Global configuration

These modules contain global configuration variables shared by all DLXS middleware.

LibGlobals.cfg

Configuration file for certain global variables shared by all DLXS modules. LibGlobals is primarily concerned with collection database configuration. An important variable is $gDatabaseSelector, which defines the type of collection database storage (e.g., CSV or MySQL) the middleware is to use.

LibVersion.pm

Not object oriented.
Holds version information for the shared code found in files in the $DLXSROOT/lib directory. This allows other Perl modules to require specific versions of the DLXS Library modules as a group.
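
As a rough illustration of requiring a minimum version, the sketch below uses Perl's built-in VERSION method. The package name My::LibVersion and its version number are hypothetical stand-ins, not the contents of LibVersion.pm.

```perl
use v5.14;
use warnings;

# Hypothetical stand-in for a version-holding module such as LibVersion.pm.
package My::LibVersion {
    our $VERSION = '2.1';
}

# UNIVERSAL::VERSION dies when the package's $VERSION is lower than requested.
my $ok      = eval { My::LibVersion->VERSION(2.0); 1 };
my $too_new = eval { My::LibVersion->VERSION(3.0); 1 };

print "2.0 requirement satisfied\n" if $ok;
print "3.0 requirement rejected\n"  unless $too_new;
```

When the module lives in its own file, the same check is normally written as a "use Module VERSION;" statement.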

BookBag modules

These modules are related to the BookBag feature of the DLXS middleware.

BookBag.cfg

This file contains variable definitions for the use of the BookBag related modules. The variables configure the SMTP host value for the mail server and the "From:" address that is to appear in the email generated by the "Mail Bookbag Contents" feature.

BookBag.pm

The base class BookBag. This container class implements the behavior of a persistent group of records that the user chooses to save from the results of searches. In TextClass, the persistence is accomplished by saving the BookBag object within the DlpsSession object and saving that object to a persistent repository such as a file or database. Class methods include those that add items to the BookBag, delete items from it, and email its contents.

BookBagItem.pm

BookBagItem is a base class representing items stored in the BookBag object. The two principal methods are GetItemHeaderAsText and GetItemHeaderAsHtml. The former generates a textual version of the record stored in the BookBagItem object for downloading. The latter generates filtered HTML for display within the BookBag window. These filtering tasks differ depending on the application and these differences are expressed in the subclasses of BookBagItem (see following). Note: ImageClass stores item IDs in the BookBag object rather than having its own BookBagItem subclass.

BookBagItem/

Directory to hold subclasses of the BookBagItem base class.

BookBagItem::BBItemBC.pm

The subclass of BookBagItem for use with BibClass.

BookBagItem::BBItemTC.pm

The subclass of BookBagItem for use with TextClass.

Database access and collection resolution

The CollsInfo and GroupsInfo classes in this category provide an abstract API to the collection database. DbUtils is non-object-oriented and provides a thinner interface modeled more closely on Perl DBI. The more abstract interfaces in the CollsInfo and GroupsInfo classes are based on the DbUtils interface.

The CollsInfo and GroupsInfo objects provide interfaces to the Collection rows and the Groups rows of the database.

DbUtils.pm

Not object oriented.
This module groups together a number of utility subroutines used by the middleware to connect to and interact with a SQL database via the Perl DBI.

CollsInfo.pm

The base class CollsInfo. An object of this class is created by the CGI program (usually via CioFactory) to maintain information about collections available for searching. When a CollsInfo object is created, it checks for a CollsInfo object cached on the DlpsSession object. It reuses the cached object if one is available. If the user's list of authorized collections has changed since the database was last accessed, it instead discards the cached object and rereads the database into a new CollsInfo object.

The CollsInfo object is passed a list of authorized collections from the environment. If a collection row read from the database is not in the list of authorized collections, the row is not saved in the object. If an authorized collection does not appear in the database, the CollsInfo object changes the authorized list by reference. Finally, the CGI object's list of requested collections is similarly modified to reflect just those collections that are authorized and exist in the database. The principal method of the CollsInfo class is GetCollKeyInfo, which returns a single field's value for a given collection based on the name of the field.

Another function of the CollsInfo object is to instantiate and maintain per-collection objects (such as those defined by TextClass) and provide access to them. This function is supported by the AddClassObjects method.
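
The authorization filtering described for CollsInfo can be sketched roughly as follows. This is not the CollsInfo implementation: the function names, the row layout and the collection IDs are hypothetical, and the sketch assumes that "changes the authorized list by reference" means pruning IDs that have no database row.

```perl
use strict;
use warnings;

# Hypothetical rows as they might be read from the collection database.
my %db_rows = (
    moa     => { collname => 'Making of America', host => 'localhost' },
    amverse => { collname => 'American Verse',    host => 'localhost' },
);

# Keep only authorized rows, then prune authorized IDs missing from the database.
# $authorized is an array reference, so the caller sees the pruned list.
sub filter_collections {
    my ($rows, $authorized) = @_;
    my %is_authorized = map { ($_ => 1) } @$authorized;
    my %kept = map  { ($_ => $rows->{$_}) }
               grep { $is_authorized{$_} } keys %$rows;
    @$authorized = grep { exists $kept{$_} } @$authorized;
    return \%kept;
}

# Return one field's value for a collection, in the spirit of GetCollKeyInfo.
sub get_coll_key_info {
    my ($colls, $collid, $key) = @_;
    return exists $colls->{$collid} ? $colls->{$collid}{$key} : undef;
}

my @authorized = ('moa', 'ghost');    # 'ghost' has no database row
my $colls = filter_collections(\%db_rows, \@authorized);

print "authorized now: @authorized\n";
print "moa name: ", get_coll_key_info($colls, 'moa', 'collname'), "\n";
```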

GroupsInfo.pm

Encapsulates information from the groups table in the DLXS database. This information includes, for example, which collections belong to which groups. The GroupsInfo object performs some validation on the groups it reads from the database by not storing any groups which consist entirely of unauthorized collections and removing collections from stored groups that are unauthorized. Authorization is defined as belonging to the list of authorized collections passed in from the environment after the list has been processed by the CollsInfo object creation.
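
A minimal sketch of the group validation described above, assuming groups are held as a hash of collection ID lists. The function name, data layout and IDs are hypothetical.

```perl
use strict;
use warnings;

# Drop unauthorized members from each group; drop groups left with no members.
sub filter_groups {
    my ($groups, $authorized) = @_;
    my %is_authorized = map { ($_ => 1) } @$authorized;
    my %kept;
    for my $group (sort keys %$groups) {
        my @members = grep { $is_authorized{$_} } @{ $groups->{$group} };
        $kept{$group} = \@members if @members;
    }
    return \%kept;
}

my %groups = (
    poetry  => [ 'amverse', 'bap' ],
    history => [ 'moa' ],
);
my $kept = filter_groups(\%groups, [ 'amverse' ]);
print "$_: @{ $kept->{$_} }\n" for sort keys %$kept;
```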

CioFactory.pm

Contains methods to drive the creation of a CollsInfo object and GroupsInfo object from the collection database. The new method creates these objects and attaches them to the DlpsSession object. The CioFactory is also responsible for creating an active database handle by connecting to the database. It passes the handle to the CollsInfo and GroupsInfo objects it instantiates and saves the handle on the DlpsSession object for general use. Another function of the CioFactory is to determine the cross-collection mode and store it on the DlpsSession object. The cross-collection mode is based on whether one or more than one collection is being handled (either via a simple list of collections or as a group of collections).

Session management

This group of modules and related files is used to implement session data that can persist for some defined period of time or indefinitely. Data that typically have a finite but short life span are the SearchHistory and the BookBag in TextClass. Arbitrary data accessible by key may be stored for convenience on the DlpsSession object. The storage can be persistent (available during the course of a session) or transient (not saved from one CGI invocation to the next). Data can be treated as transient simply to have a convenient place to store it during the course of one CGI run.

DlpsSession.cfg

The configuration file for the DlpsSession module. An important global variable in this file is %gSessDatabaseConfig which configures the session backing store for the MySQL database. Usernames and passwords are also configured here.

DlpsSession.pm

The DlpsSession object. This object, which is a wrapper for the Apache::Session object, allows access to the Apache::Session tied hash using Perl OO syntax. Principal methods on this object are SetPersistentSessionItemByKey and GetPersistentSessionItemByKey, which support saving data to and retrieving data from the backing store. The DlpsSession object contains an internal CioWrapper object, which allows multiple CollsInfo and GroupsInfo objects to be saved and retrieved in support of multiple applications that typically each have their own collection metadata. This mechanism is part of the cross-application functionality.
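
The wrapper idea can be illustrated with a small sketch. The class My::Session is hypothetical, a plain hash stands in for the Apache::Session tied hash, and the argument order of the two methods is an assumption. Only the method names come from the description above.

```perl
use v5.14;
use warnings;

# Hypothetical wrapper class; a plain hash stands in for the tied session hash.
package My::Session {
    sub new {
        my ($class, $store) = @_;
        return bless { store => $store }, $class;
    }

    # Persistent items are written through to the backing hash.
    sub SetPersistentSessionItemByKey {
        my ($self, $key, $value) = @_;
        $self->{store}{$key} = $value;
        return;
    }

    sub GetPersistentSessionItemByKey {
        my ($self, $key) = @_;
        return $self->{store}{$key};
    }
}

my %backing_store;    # with Apache::Session this would be the tied hash
my $dso = My::Session->new(\%backing_store);
$dso->SetPersistentSessionItemByKey('bookbag', [ 'item1', 'item2' ]);

# A later wrapper around the same store sees the saved value.
my $later = My::Session->new(\%backing_store);
print scalar @{ $later->GetPersistentSessionItemByKey('bookbag') }, " items\n";
```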

CioWrapper.pm

CioWrapper is a container object encapsulated by the DlpsSession object. It aggregates several CollsInfo and GroupsInfo objects and provides access to them in a single package.

CreateSessionTable.txt

This is a SQL script used to create a proper table in a SQL database to support DlpsSession sessions.

Application Class Modules

The modules in this section are for the support of the DLXS "Classes" (e.g., Text, Image, Bib) as Perl OO-classes so that they can be subclassed to modify behavior and instantiated as objects. This architecture allows the CGI layer to be very thin and supports the creation of cross-class CGI application functionality such as searching and displaying results from more than one application class.

DLXSApp.pm

The base Application class from which the principal DLXS Classes are derived. Its subclasses include BibApp, ImageApp, and FullTextApp, which in turn is subclassed into TextApp and FindaidApp. DLXSApp provides a few methods which are common to all DLXS applications. For more information about the full class hierarchy, see: DLXS Object Hierarchy.

(Document) Class Related Modules

The modules in this section have methods that represent the behavior specific to the types or classes of "objects" delivered through the DLXS Middleware (e.g., Text, Finding Aids, Bib). These are also OO classes so that they may be easily subclassed for the purpose of handling collection specific behavior. Note: ImageClass does not currently participate in this class hierarchy. This hierarchy contains code for storing and retrieving collection data (including other objects) as well as code for searching and filtering. For more information about the full class hierarchy, see: DLXS Object Hierarchy.

DLXSClass.pm

This is the superclass for all collections. One subclass of this is FullTextClass, of which TextClass and FindaidClass are subclasses. BibClass is also a subclass of DLXSClass.

ItemView.pm

The ItemView object holds data from the Pageview and ArticleClips tables in the DLXS database. The Pageview table contains metadata about all page images available for a particular XML file.

XPAT search and result modules

Modules in this section are concerned with constructing, organizing, submitting and retrieving queries using the XPAT search engine.

SearchSet.pm

The SearchSet object encapsulates queries that will be sent in a group to an XPAT session. Each search is given a label by the CGI to identify the kind of information that will be returned. When the searches in the search set are sent to XPAT, this label is returned along with the results for that search query. In this way, the CGI knows the type of each result, primarily so that it can decide how to filter the results. Search sets are grouped and identified by a name, so that more than one set of searches can be manipulated during the course of one CGI invocation if needed.
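
The labeling scheme can be sketched as below. The class My::SearchSet, its method names and the run_search_set function are hypothetical, and the query strings are placeholders rather than real XPAT syntax. A stub stands in for the search engine.

```perl
use v5.14;
use warnings;

# Hypothetical container: a named group of labeled queries.
package My::SearchSet {
    sub new {
        my ($class, $name) = @_;
        return bless { name => $name, searches => [] }, $class;
    }

    sub AddSearch {
        my ($self, $label, $query) = @_;
        push @{ $self->{searches} }, { label => $label, query => $query };
        return;
    }

    sub Searches {
        my ($self) = @_;
        return @{ $self->{searches} };
    }
}

# Stand-in for the search engine: hands each label back with a fake result.
sub run_search_set {
    my ($set) = @_;
    my @results;
    for my $search ($set->Searches) {
        push @results, {
            label  => $search->{label},
            result => "result of: $search->{query}",
        };
    }
    return @results;
}

my $set = My::SearchSet->new('main');
$set->AddSearch(hits    => 'placeholder query 1');
$set->AddSearch(headers => 'placeholder query 2');

my @results = run_search_set($set);
print "$_->{label}: $_->{result}\n" for @results;
```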

XPat.pm

XPat.pm contains code to create an XPat object. The first thing this object does is open an XPAT session for a particular TextClass instantiation. This entails forking off a process for a collection that is local to the machine from which the request comes, or opening a socket connection to a remote machine if the collection data resides on a different machine.

The middleware then uses this object to interact with an XPAT process. There is one XPat object per XPAT dd file used by a given collection. Through this object's methods the middleware can submit searches to XPAT and receive results from it. The principal method is GetResultsFromQuery which accepts a single query string in XPAT syntax and returns an XPAT result encapsulated in an XPatResult object. Another commonly used method is GetSimpleResultsFromQuery which returns the result in raw XPAT result syntax.

XPatResult.pm

An object of this class contains the results returned from one XPAT search. XPatResult is basically a container class which organizes an XPAT result by parsing the result into a record containing the raw XML data and the data byte offset. The contents of the XPatResult object are retrieved by initializing an iterator (InitIterator method) and calling the GetNextResult method.
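
The iterator pattern described here might look like the following sketch. The class My::Result and its record layout are hypothetical. Only the method names InitIterator and GetNextResult come from the description.

```perl
use v5.14;
use warnings;

# Hypothetical container: records holding raw XML and a byte offset.
package My::Result {
    sub new {
        my ($class, @records) = @_;
        return bless { records => [@records], cursor => 0 }, $class;
    }

    # Reset the iterator to the first record.
    sub InitIterator {
        my ($self) = @_;
        $self->{cursor} = 0;
        return;
    }

    # Return the next record, or nothing once the records are exhausted.
    sub GetNextResult {
        my ($self) = @_;
        return if $self->{cursor} >= @{ $self->{records} };
        return $self->{records}[ $self->{cursor}++ ];
    }
}

my $result = My::Result->new(
    { offset => 120, xml => '<P>first</P>' },
    { offset => 480, xml => '<P>second</P>' },
);

my @seen;
$result->InitIterator;
while (my $rec = $result->GetNextResult) {
    push @seen, $rec->{offset};
    print "$rec->{offset}: $rec->{xml}\n";
}
```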

XPatResultSet.pm

A container object that maintains a set of XPatResult objects in a special form. When an XPatResult object is added to an XPatResultSet object it loses its individual identity and becomes part of the set organized by byte offset.

The results are grouped under the same name that identifies the SearchSet that led to the results. Each result consists of pieces of information that are acquired, parsed and separated by the XPat object's AddResult method, plus one separate value. The parsed pieces are: the index containing the result (in case there are multiple indexes per collection); the byte offset of the result in that index's data; the label given by the search; and the raw XML returned by the XPAT query (in the case of XPAT pr requests) or the number of matches (in the case of simple requests). The separate value is a reference to the XPat object that was used to get the results.

The GetNextResult method on XPatResultSet returns all XPAT results added to it in byte offset order. There is also support for iteration over results by different sorting orders.
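
Merging results into byte offset order can be sketched as follows. The class My::ResultSet is hypothetical and its AddResult signature is an assumption. The real XPatResultSet also tracks the index and the originating XPat object, which this sketch omits.

```perl
use v5.14;
use warnings;

# Hypothetical result set: merges labeled records and keeps them in offset order.
package My::ResultSet {
    sub new {
        my ($class) = @_;
        return bless { results => [], cursor => 0 }, $class;
    }

    # Tag each record with its search label, then re-sort the whole set.
    sub AddResult {
        my ($self, $label, @records) = @_;
        for my $rec (@records) {
            push @{ $self->{results} }, { %$rec, label => $label };
        }
        @{ $self->{results} } =
            sort { $a->{offset} <=> $b->{offset} } @{ $self->{results} };
        return;
    }

    sub GetNextResult {
        my ($self) = @_;
        return if $self->{cursor} >= @{ $self->{results} };
        return $self->{results}[ $self->{cursor}++ ];
    }
}

my $set = My::ResultSet->new;
$set->AddResult(hit    => { offset => 480, xml => '<P>b</P>' });
$set->AddResult(header => { offset => 120, xml => '<HEADER>a</HEADER>' });

my @order;
while (my $rec = $set->GetNextResult) {
    push @order, "$rec->{offset}/$rec->{label}";
}
print "@order\n";
```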

RemoteConnect.pm

Contains methods used by the middleware to connect via a socket to the dlxsd daemon running on a remote host so that the remote host may run XPAT for the requesting machine. This class never needs subclassing.

TerminologyMapper.pm

Encapsulates the mappings between labels and other terms for a given collection. These mappings are configured in a collection's map file by the collection implementor.

QueryFactory.pm

A module that can, using a TerminologyMapper object and a CGI object, construct a basic XPAT query for simple, boolean and proximity searches.

Support modules

Modules in this section provide a variety of utility routines shared by all DLXS applications.

DevUtils.pm

Not object-oriented.
This module groups together utility subroutines used by DLPS in development work. For example, they allow individual programming staff members to maintain their own working directories, apart from the directory holding the release version of the software. Another example is support for collection authorization in an easily editable file instead of a database.

DlpsUtils.pm

Not object-oriented.
A grouping of utility subroutines that are used throughout the DLXS middleware. Examples include routines to strip leading and trailing spaces from a string, find the minimum or maximum of an array of values, and display errors.
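
Utilities of this kind are short in Perl. The sketch below is illustrative only: trim_spaces is a hypothetical name, and min and max are taken from the core List::Util module rather than from DlpsUtils.pm.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: return a copy with leading and trailing whitespace removed.
sub trim_spaces {
    my ($string) = @_;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

my @page_counts = (12, 340, 7);
print '[', trim_spaces("  moa  "), "]\n";
print min(@page_counts), ' ', max(@page_counts), "\n";
```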

ObjectFactory.pm

ObjectFactory is a factory class that manages the creation of various DLXS objects whose instantiation may require investigation of the collections in the current request to determine which subclass of the object to create. If all collections in a request are configured to use a single subclass of a given Application, that subclass is instantiated. Otherwise the base application class (e.g. TextApp) is instantiated. ObjectFactory is used principally to instantiate TextApp, BibApp and ImageApp or subclasses thereof.
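
The selection rule can be sketched as below. The class names, the configuration hash and the function choose_app_class are all hypothetical. The sketch only shows the "single shared subclass, otherwise base class" decision.

```perl
use v5.14;
use warnings;

# Hypothetical base application class and one subclass.
package My::TextApp {
    sub new {
        my ($class) = @_;
        return bless {}, $class;
    }
}
package My::TextApp::Poetry {
    our @ISA = ('My::TextApp');
}

# Pick the subclass only if every collection in the request is configured for it.
sub choose_app_class {
    my ($base_class, $subclass_for, @collids) = @_;
    my %seen;
    for my $collid (@collids) {
        my $class = defined $subclass_for->{$collid}
                  ? $subclass_for->{$collid}
                  : $base_class;
        $seen{$class} = 1;
    }
    my @classes = keys %seen;
    return @classes == 1 ? $classes[0] : $base_class;
}

my %subclass_for = (
    amverse => 'My::TextApp::Poetry',
    bap     => 'My::TextApp::Poetry',
);

my $same  = choose_app_class('My::TextApp', \%subclass_for, 'amverse', 'bap')->new;
my $mixed = choose_app_class('My::TextApp', \%subclass_for, 'amverse', 'moa')->new;
print ref($same), "\n", ref($mixed), "\n";
```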

Miscellaneous modules

ProcIns.pm

Contains methods to handle Processing Instructions found in HTML templates.

ApplicationResult.pm

This module provides a global array into which a given application class, such as TextClass, may insert the IDs of search results. This array is managed for the purposes of sorting and efficient navigation over the list of results. It also provides mechanisms to allow more than one class to insert result IDs, thereby laying the groundwork for integrated cross-class results display and navigation within a unitary user interface. This object also has mechanisms to permit it to be saved on the DlpsSession object and so act as a result cache to improve performance when switching between views of the same results.

SearchHistory.pm

Keeps track of a user's searches during the course of a session. The list of previous searches can be recalled at any time. From the displayed list, clicking on any previous search will resubmit that search. The SearchHistory object is saved with the DlpsSession object.

WW.pm

An object of this class encapsulates data from a wordwheel, built in a separate process.

ww2.cfg

Configuration file for the WordWheel module.

roman_numeral.pm

Not object oriented.
This module includes subroutines to convert strings representing Roman numerals to Arabic numerals and vice versa. Used mostly by pageviewer.
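
A minimal sketch of both conversions, assuming standard subtractive notation. The function names are hypothetical and the sketch does not validate its input, so it should not be read as the module's actual code.

```perl
use strict;
use warnings;

# Value/symbol pairs in descending order, including the subtractive forms.
my @ROMAN = (
    [1000, 'M'], [900, 'CM'], [500, 'D'], [400, 'CD'], [100, 'C'], [90, 'XC'],
    [50, 'L'],   [40, 'XL'],  [10, 'X'],  [9, 'IX'],   [5, 'V'],   [4, 'IV'],
    [1, 'I'],
);

# Greedily emit the largest symbol that still fits.
sub arabic_to_roman {
    my ($n) = @_;
    my $roman = '';
    for my $pair (@ROMAN) {
        my ($value, $symbol) = @$pair;
        while ($n >= $value) {
            $roman .= $symbol;
            $n -= $value;
        }
    }
    return $roman;
}

# Subtract a digit when a larger digit follows it; otherwise add it.
sub roman_to_arabic {
    my ($roman) = @_;
    my %value_of = (I => 1, V => 5, X => 10, L => 50, C => 100, D => 500, M => 1000);
    my @digits = map { $value_of{$_} } split //, uc $roman;
    my $total = 0;
    for my $i (0 .. $#digits) {
        my $next = $i < $#digits ? $digits[$i + 1] : 0;
        $total += $digits[$i] < $next ? -$digits[$i] : $digits[$i];
    }
    return $total;
}

print arabic_to_roman(14), "\n";
print roman_to_arabic('xiv'), "\n";
```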

AuthNZ.pm

This module defines an object-oriented class that implements methods to:

  1. recognize one of several possible appropriate authentication and authorization modules
  2. load that module dynamically
  3. determine whether a requested resource is authorized by that module
  4. create side-effects in the AUTHZD_COLL, PUBLIC_COLL and REMOTE_USER environment variables and store information in the DlpsSession object (dso) to record this authorization
  5. provide services supporting the creation of login and logout URLs

An AuthNZ object (anzo) encapsulates and records the results of the authentication and authorization process. This object is attached to the session object (dso) for later reference. The public interface to this module consists of the following routines

PIFiller.pm

This module defines the parent class for class level PI Fillers.

XsltPIFiller.pm

This is the derived class of PIFiller containing methods to handle the filling of global-level Processing Instructions.