Tentative Schedule

(Each morning session runs from 9 am to 12 pm with a break;
each afternoon session runs from 1:30 pm to 4:30 pm with a break.)

For a more detailed view of the topics, see the Course Outline.

Day 1


  1. Introductions and course objectives
  2. Overview
    1. Process Overview
    2. Environment
    3. Document Classes
    4. Directory Structure
    5. Data Preparation and XPat Overview
    6. Text Class Components
    7. Image Class
  3. Environment details
  4. Directory Structure details


  1. Image Class
    1. Installation and Configuration
    2. Image Class Access Restrictions
    3. Discussion of Approaches to Batch Image Processing
    4. Image Processing Software Links

Day 2


  1. Data Preparation (Part 1): Encoding & Transformation
    1. Data Sources
    2. GUMS/TextClass
    3. Inline markup (text munging vs. new XPat functionality)
    4. Unnumbered, nested, identical elements
    5. SGML Tools
    6. Transformation
    7. Normalization
    8. TermMapper & Fabricated regions
    9. Levels of Encoding
  2. Installation of DLXS TextClass Middleware & Content
    1. Extract tar files (Hands On)
    2. Edit configuration files (Hands On)


  1. Makefile
  2. Normalization (Hands On)
  3. XPat Search Engine
    1. History of the software
    2. Indexing
      1. SGML text indexing (Hands On)
      2. Region Indexing (Hands On)
    3. Query Language (Hands On)
    4. Fabricated regions (Hands On)

Day 3


  1. Related / Derivative Data
    1. TextClass
      1. CollDb
      2. Mapper
    2. Pageview Data Preparation (Hands On)
      1. Background
      2. pageview.dat files
    3. WordWheel Data Preparation (Hands On)
      1. History
      2. WordWheel data creation
  2. Program Architecture
    1. text-idx
      1. Functional Requirements
      2. configuration files


  1. Program Architecture continued
    1. text-idx continued
      1. Objects used
      2. URL parameters
      3. text-idx walkthrough

Day 4


  1. Program Architecture continued
    1. text-idx continued
      1. walkthrough as needed
  2. User Interface issues
    1. TBD
  3. pageviewer-idx
    1. Background and overview
    2. PageView object
    3. Creation of pageview.dat file (Hands On)
    4. walkthrough


  1. ww-idx
    1. WW object, and others
    2. XPat indexed word data
    3. walkthrough
  2. Subclassing the TextClass
    1. examples

Day 5


  1. Q&A
  2. The Future