The Florida Historical Legal Documents collection grew out of a project to convert Florida Territorial Laws to full-text by the Legal Information Center at the University of Florida's Fredric G. Levin College of Law. The Collection's managers intend to build from Florida Territorial Laws to document Florida's legal heritage through the State's laws, opinions and findings, briefs and reports, cases, etc.
OVERVIEW AND WORKFLOW
The Florida Historical Legal Documents collection is established under the Florida Heritage Collection model. Contributing institutions select resources for digitization, then perform or outsource the digitization and create files of structural metadata describing how images relate to the logical parts of the resource. The structural metadata record and the set of images for each resource are transmitted to the Florida Center for Library Automation (FCLA), where the data is loaded into a DB2 application on a central Unix server. Identifiers that serve as persistent URLs pointing to the DB2 application are inserted into the catalog records, which provide name and topical access to the electronic resources.
Conversion to text with mark-up is the Collection's tertiary task. The libraries perform or outsource the conversion to ASCII text and mark-up, using a subset of Text Encoding Initiative (TEI) tags optimized for FCLA FullText Collections. Marked-up texts are transmitted to FCLA, where the text is loaded into the FullText Collections server running XPAT 5.0 software, distributed by the University of Michigan's Digital Library eXtension Service (DLXS).
RESOURCE DESCRIPTION -- CATALOGING & BIBLIOGRAPHY
Participating libraries are responsible for creating full MARC catalog records for selected materials from their own collections. Catalog records form the core bibliography. Cataloging records are maintained at FCLA in a union database of all Florida Historical Legal Documents collection materials, and records for digitized resources are also contributed to OCLC WorldCat.
Cataloging is expected to adhere to the Cataloging and Access Guidelines for Electronic Resources (CAGER) developed by the Technical Services Planning Committee. The guidelines specify that records should represent the electronic versions only, and include specific instructions to:
- Put the date of the original in Fixed Field Date1, the date of digitization in Date2, and use Form of Reproduction "s";
- Include a title (245) subfield h to indicate the resource is electronic;
- Specify the digitizing institution and date of digitization in the imprint (260);
- Include a series statement (830) for the Florida Historical Legal Documents collection, justified by a general note (500);
- Use an original version note (534) to record the location of and publication information for the source document.
Complete MARC cataloging instructions can be found in the CAGER Guidelines.
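As a rough illustration of the instructions listed above, the sketch below checks a record for the variable fields they name (245 subfield h, 260, 500, 534, 830). The record layout (a dictionary mapping a MARC tag to a list of subfield dictionaries), the function name and the placeholder values are all assumptions made for this example; they are not part of CAGER or of any FCLA software, and the fixed-field requirements are not checked.

```python
from __future__ import annotations

# Variable fields named in the cataloging instructions above.
REQUIRED_TAGS = ("245", "260", "500", "534", "830")


def cataloging_problems(record: dict[str, list[dict[str, str]]]) -> list[str]:
    """Return a list of problems found in a (hypothetical) record structure.

    `record` maps a MARC tag to a list of fields; each field is a dict
    mapping a subfield code to its value. This layout is an assumption.
    """
    problems = [f"missing field {tag}" for tag in REQUIRED_TAGS if not record.get(tag)]
    titles = record.get("245", [])
    # The title field should carry subfield h to mark the resource as electronic.
    if titles and not any("h" in field for field in titles):
        problems.append("245 lacks subfield h")
    return problems


# Placeholder values only; these are not taken from a real catalog record.
example_record = {
    "245": [{"a": "Example title", "h": "[electronic resource]"}],
    "260": [{"a": "Example place", "b": "Example digitizing institution", "c": "Example date"}],
    "500": [{"a": "Example general note"}],
    "534": [{"p": "Original version:", "c": "Example source publication details"}],
    "830": [{"a": "Florida Historical Legal Documents"}],
}

print(cataloging_problems(example_record))
```

A record that passes this check still needs review against the full CAGER Guidelines; the sketch only confirms that the listed fields are present.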
IMAGE CAPTURE AND CONVERSION
Participating libraries are responsible for clearing copyright on, and subsequently digitizing, selected materials from their own collections. Each library may perform its own digitization, or contract with a vendor or with another SUS library for digitization services. Image capture must adhere to the standards promulgated by the Cornell Department of Preservation and Conservation (see Digital Imaging for Libraries and Archives, Kenney and Chapman, 1996). A Quality Index of 5 or better for visual images is required.
Three types of images are created for all textual materials in the collection: TIFF, JPEG and PDF. A TIFF image and a JPEG image are created for every page; related sets of pages (e.g. chapters or articles) are bundled into PDF files. Participating libraries create the TIFF and JPEG images and submit them to FCLA, which subsequently creates the PDF derivatives.
TIFF images are created as the direct result of scanning source materials (that is, as the native file format), using a variety of scanning hardware, primarily flat-bed scanners. TIFFs are archived as uncompressed electronic masters. Bit-depth is appropriate to the source and its anticipated use, and may be bitonal, 8-bit grey, 24-bit color, or greater. Color images are created and maintained in the sRGB color-space. Both grey and color images are calibrated and scanned to within the tolerances promulgated by the Library of Congress for the American Memory project. Images created from microfilmed sources reflect the quality of the source microfilm.
JPEG derivatives are created from the TIFF images in a batch process using Adobe Photoshop or Cerious ThumbsPlus. Each TIFF image is resized to a width of 630 pixels, with the height scaled proportionally. PDF files are created by locally written loader software. The loader calls a LeadTools custom ActiveX control to open sets of JPEG images, and then uses Thomas Merz's PDFlib software to build the PDF.
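The resizing rule can be illustrated with a little arithmetic. The following is a minimal sketch, not the Photoshop or ThumbsPlus batch process itself: the function name is hypothetical, and rounding the scaled height to the nearest pixel is an assumption.

```python
from __future__ import annotations


def jpeg_dimensions(tiff_width: int, tiff_height: int, target_width: int = 630) -> tuple[int, int]:
    """Return (width, height) of a derivative scaled to target_width.

    The aspect ratio of the source image is preserved; the height is
    rounded to the nearest pixel (an assumption for this sketch).
    """
    if tiff_width <= 0 or tiff_height <= 0:
        raise ValueError("source dimensions must be positive")
    scale = target_width / tiff_width
    return target_width, max(1, round(tiff_height * scale))


# A 2520 x 3300 pixel master is scaled by 630 / 2520 = 0.25.
print(jpeg_dimensions(2520, 3300))  # (630, 825)
```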
Text-based versions, whether encapsulated with PDF, HTML or other mark-up, are produced either by re-keying from source documents or by optical character recognition (OCR) of TIFF images. A minimum accuracy rate of 99.995% is required. Mark-up employs a subset of Text Encoding Initiative (TEI) tags optimized for FCLA FullText Collections.
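To give a sense of scale, 99.995% accuracy allows 5 errors in every 100,000 units of text. The sketch below assumes that accuracy is measured per character, which the requirement as stated here does not specify; the function names are hypothetical. Integer arithmetic is used so that the threshold is not affected by floating-point rounding.

```python
# 99.995% accuracy is equivalent to at most 5 errors per 100,000 characters
# (assuming accuracy is counted per character).
ALLOWED_ERRORS = 5
PER_CHARACTERS = 100_000


def max_errors(char_count: int) -> int:
    """Largest number of errors a text of char_count characters may contain."""
    return char_count * ALLOWED_ERRORS // PER_CHARACTERS


def meets_threshold(char_count: int, error_count: int) -> bool:
    """True if error_count errors in char_count characters is within 99.995%."""
    return error_count * PER_CHARACTERS <= char_count * ALLOWED_ERRORS


print(max_errors(100_000))         # 5
print(max_errors(2_000))           # 0: a 2,000-character page must be error-free
print(meets_threshold(40_000, 2))  # True
print(meets_threshold(40_000, 3))  # False
```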
RESOURCE DESCRIPTION -- STRUCTURAL METADATA
A file of structural metadata is created for every document to indicate the relationship between the physical units of digitization (TIFF, JPEG and other images) and the logical units of publication (pages, chapters, and other parts). The metadata format used is a locally created document type definition, the MXF.
For each electronic resource (book volume, journal issue, manuscript, etc.), the MXF file:
- identifies and names the image files comprising the resource,
- defines the order of images,
- identifies and names the subsections (such as chapters),
- says which images belong to particular subsections, and
- establishes the order and hierarchy of subsections.
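The information listed above can be pictured as a small tree. The sketch below is only an in-memory model of that information, written for illustration; it does not reproduce the MXF document type definition, and every class, field and file name in it is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Subsection:
    """A logical unit such as a chapter, with its images and nested parts."""
    title: str
    images: list[str] = field(default_factory=list)
    children: list[Subsection] = field(default_factory=list)


@dataclass
class Resource:
    """One electronic resource: an ordered image list plus its subsections."""
    identifier: str
    images: list[str]
    subsections: list[Subsection] = field(default_factory=list)


def outline(resource: Resource) -> list[str]:
    """Return the subsection hierarchy as indented lines, in document order."""
    lines: list[str] = []

    def walk(sections: list[Subsection], depth: int) -> None:
        for section in sections:
            lines.append("  " * depth + f"{section.title} ({len(section.images)} images)")
            walk(section.children, depth + 1)

    walk(resource.subsections, 0)
    return lines


# Placeholder identifiers and file names, invented for the example.
example = Resource(
    identifier="EXAMPLE0001",
    images=["00001.tif", "00002.tif", "00003.tif"],
    subsections=[
        Subsection("Front matter", ["00001.tif"]),
        Subsection("Chapter 1", ["00002.tif", "00003.tif"],
                   children=[Subsection("Section 1.1", ["00003.tif"])]),
    ],
)

print("\n".join(outline(example)))
```

List order stands in for the order of images and subsections, and nesting stands in for the hierarchy.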
IMAGE LOADING, STORAGE and NAVIGATION
For each volume that is digitized, a directory containing structural metadata files and a set of images and/or text files is sent by FTP from the contributing institution to FCLA. The metadata and images are processed by a locally written loader, which first checks that all the image files referenced by the MXF file are present, then copies the images into a Florida Historical Legal Documents collection directory and loads the structural metadata into DB2 tables maintained on a Unix server. If instructed, the loader will also create derivative formats such as PDF files.
Once structural metadata is loaded and images are moved to the appropriate directories, access and navigation are provided by another locally written DB2 server program.
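The loader's first step, confirming that every referenced image is present, might look something like the sketch below. This is not the FCLA loader: the function name and file names are hypothetical, and the list of referenced files is passed in directly because the MXF syntax is not reproduced here.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def missing_images(directory: str | Path, referenced: list[str]) -> list[str]:
    """Return the referenced file names that are not present in directory.

    `referenced` stands in for the image names listed in the structural
    metadata file; parsing that file is outside the scope of this sketch.
    """
    base = Path(directory)
    return [name for name in referenced if not (base / name).is_file()]


# Demonstration with a temporary directory holding one of two expected files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "00001.tif").touch()
    demo_missing = missing_images(tmp, ["00001.tif", "00002.tif"])

print(demo_missing)  # ['00002.tif']
```

A loader built along these lines would stop and report the missing files before copying anything or loading metadata.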
Persistent URLs referencing the server application are created by program and inserted into the bibliographic record describing the resource.
RETRIEVAL
The cataloging records describing Florida Historical Legal Documents collection resources are loaded into a shared central library management system, a locally developed application based on NOTIS, on an IBM mainframe. The records can be searched through the SUS Libraries' online catalog application, WebLUIS. All traditional catalog access points are available (author, title, subject, etc.).
Once records are retrieved, the URLs in the bibliographic record serve as hotlinks to the DB2 server application, which initially presents a Table of Contents display for graphic formats and a text selections display for the full-text format.
FUTURE DIRECTIONS
Participating libraries will continue to contribute materials to the Florida Historical Legal Documents collection as funding becomes available.
A State University System of Florida PALMM Project