Pageviewer data

  1. pageview.dat file structure
    1. filename (actual TIFF file name)
    2. seq (absolute page sequence number within the text item)
    3. pagenum (original text's page number; e.g., 34, xvii, 42a, etc.)
    4. confid (OCR confidence value)
    5. feature (e.g., Table of Contents, Advertisement, Unspecified, Index, etc.)
  2. Creation
    1. Run the pageview target in the main Makefile
    2. That target runs the makepageviewdata.pl script
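
The field order listed under "pageview.dat file structure" can be read back with a few lines of Python. This is only a minimal sketch under stated assumptions: the notes above do not say how fields are separated, so the tab delimiter, the one-record-per-line layout, and the absence of a header row are all guesses, and the names PageRecord and parse_pageview are hypothetical helpers, not part of the existing tooling. The sample rows are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Field order as listed in the outline above.
FIELDS = ("filename", "seq", "pagenum", "confid", "feature")


@dataclass
class PageRecord:
    """One page entry (hypothetical helper type, not part of the real tooling)."""
    filename: str  # actual TIFF file name
    seq: int       # absolute page sequence number within the text item
    pagenum: str   # original page number; kept as text because of values like "xvii" or "42a"
    confid: str    # OCR confidence; kept as text because its numeric format is not documented here
    feature: str   # e.g. "Table of Contents", "Index", "Unspecified"


def parse_pageview(lines: Iterable[str], delimiter: str = "\t") -> Iterator[PageRecord]:
    """Yield one PageRecord per non-blank line.

    ASSUMPTION: fields are separated by `delimiter` (tab by default) and
    there is no header row. Adjust both if the real file differs.
    """
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split(delimiter)
        if len(parts) != len(FIELDS):
            raise ValueError(
                f"line {lineno}: expected {len(FIELDS)} fields, got {len(parts)}"
            )
        filename, seq, pagenum, confid, feature = parts
        yield PageRecord(filename, int(seq), pagenum, confid, feature)


# Made-up sample rows, for illustration only.
sample = [
    "00000001.tif\t1\txvii\t100\tTable of Contents\n",
    "\n",
    "00000002.tif\t2\t34\t95\tUnspecified\n",
]
records = list(parse_pageview(sample))
```

To read a real file, pass an open file handle instead of the sample list, for example `with open("pageview.dat") as fh: records = list(parse_pageview(fh))`. The pagenum field is deliberately left as a string, since the outline notes that it can hold values such as xvii or 42a.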