Description of Metadata - ALA Archives Digital Collections
 
 

The following list describes the metadata fields used for the ALA Archives Digital Collections, along with guidelines for entering each of them. This metadata scheme was derived primarily from the VRA (Visual Resources Association) Core Categories, with several modifications to accommodate all possible collections in the Archives and to allow mapping to the Dublin Core Metadata Element Set. The list is divided into three categories based on the type of metadata. For further information on metadata types, see the Metadata Types Table of "Moving Theory into Practice: Digital Imaging Tutorial."



Descriptive Metadata | Administrative Metadata | Structural Metadata | Metadata in CONTENTdm | Metadata Summary in Table Format

DESCRIPTIVE METADATA

Title. The Title of the picture, not the Caption written on the photographic paper. The Title should be clear and adequate to describe the content of the picture. If a picture has no clear description, or no description at all, make one up. Example: A picture of a group of librarians has “Colorado” as the only written description on the photograph. Use “Colorado” as the Title, but add more description; otherwise a user reading the Title without seeing the picture will take it for a picture of a place in Colorado. Adding a description such as “Group Picture” makes the Title “Colorado – Group Picture,” which is much more descriptive for users. A Group Picture is defined as a picture of 3 or more people, taken intentionally in such a way that the people in the picture are identifiable.

Translation Title. This field is to be used when the original Title is not in English. Entries in this field should always be in English. The content of this field serves as a finding aid only in the "Search Across All Fields" option.

Alternative Title. Other Title information that needs to be recorded but does not need to appear in the Title field. The content of this field serves as a finding aid only in the "Search Across All Fields" option. It is not intended to be displayed to the user along with the picture.

Note: The Translation Title and Alternative Title fields are currently not in use. However, it is recommended to keep both fields, since future collections might need them. Although these fields are of little use in the current system (CONTENTdm) because of its nature (a text or plain database), they will be very useful in a relational database system, where they can be used to display information about the Title in a much more flexible way.


Type. The Type field describes the type of resource of the Original Work, rather than of the Digitized Material OR the Digital Manifestation of the Work. Entry options for this field are: Collection, Dataset, Event, Image, Interactive Resources, Service, Software, Sound, or Text (options in Boldface are the most commonly-used options). These options are derived from the Dublin Core Metadata Initiative Type Vocabulary.

Digitized Material. Describes the name (designation) for the type of the (physical) material being digitized. Entry options for this field are: Black and white photographs, Black and white panoramic photographs, Black and white slides, Black and white negatives, Black and white postcards, Black and white transparencies, Color photographs, Color panoramic photographs, Color slides, Color negatives, Color postcards, Posters, or Manuscripts. For other options refer to Getty's Art & Architecture Thesaurus.

Example for Type and Digitized Material entries:
Additional considerations for Type and Digitized Material entries: Sometimes a picture can fall into either the Image or the Event category. To choose the most suitable category, determine the main, dominant intention of the picture.

Year Coverage. Contains only the numeric value of the year the picture was taken. To give an approximate year, use the format "xxxx ca.", where xxxx is the four-digit year. When no information is available on the year the picture was taken, leave the field blank.
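The Year Coverage rule can be checked mechanically before data entry. The following is a minimal sketch, assuming entries are handled as plain strings; the function name is hypothetical and is not part of CONTENTdm.

```python
import re

# A four-digit year, optionally followed by " ca." for an approximate year.
YEAR_COVERAGE = re.compile(r"\d{4}( ca\.)?")


def is_valid_year_coverage(value: str) -> bool:
    """Return True for a blank value, a four-digit year, or 'xxxx ca.'."""
    # A blank field is allowed: it means the year is unknown.
    if value == "":
        return True
    return YEAR_COVERAGE.fullmatch(value) is not None
```

A value such as "ca. 1916" or "16" would be rejected, which flags the record for correction before it is uploaded.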

PeriodName Coverage. Contains the alphanumeric value of the name of the period represented in the picture. Use only popular Period Names (World War II, Civil War, etc.) based on the Library of Congress Subject Headings.

Geographic Coverage. Contains the Geographic Name of the location where the picture was taken. It can also be the name of a place (hotels, conference centers, designated names of a location, etc.). Don’t use the abbreviations used in AACR2R (Ill., N.Y., Cal., etc.); use the full name of the place instead, because users are more likely to search with the full name than with the AACR2R abbreviations. Provide possible variations of Geographic Names (Saint Louis and St. Louis, etc.). Try to conform to the Library of Congress Subject Headings as much as possible. Provide additional entries when necessary. Example: For a picture taken in Narragansett Pier, Rhode Island, give entries for “Narragansett Pier” AND “Narragansett, Rhode Island.” Don’t forget to separate each entry with a semi-colon (no space before, but one space after) when typing them in the database. Do NOT put a semi-colon after the last entry, since it will appear on the web. This is the format used in CONTENTdm.
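The semi-colon convention for multi-value fields can be applied consistently with a small helper. This is a sketch only, assuming the entries are prepared as a list of strings before being typed or pasted into the database; the function name is hypothetical.

```python
from typing import List


def join_entries(entries: List[str]) -> str:
    """Join entries with '; ' (no space before, one space after).

    Stray semi-colons and surrounding spaces are stripped from each entry,
    and no semi-colon is placed after the last entry.
    """
    cleaned = [entry.strip().rstrip(";").strip() for entry in entries]
    # Drop entries that are empty after cleaning.
    return "; ".join(entry for entry in cleaned if entry)


geographic = join_entries(["Narragansett Pier", "Narragansett, Rhode Island"])
print(geographic)
```

Note that commas inside an entry are left alone; only the semi-colon acts as the separator.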

Names Coverage. Contains Personal Names related to the resources. Corporate Names can also be entered here as long as the names as a whole do NOT reflect any geographical designation, in which case they should be entered in the Geographic Coverage field. Don’t forget to separate each entry with a semi-colon (no space before, but one space after) when typing them in the database. Do NOT put a semi-colon after the last entry, since it will appear on the web. This is the format used in CONTENTdm.

Subject. All entries in this field should NOT reflect Period Name, Geographical information, and Personal/Corporate Names (they have been represented by the Period Name Coverage, Geographic Coverage, and Names Coverage fields). Refer to Library of Congress Thesaurus for Graphic Materials I: Subject Terms (TGM I) to learn how to assign subject terms. Don’t forget to separate each entry with a semi-colon (no space before, but give one space after) when typing them in the database. Do NOT put semi-colon after the last entry since it will appear on the web. This is the format used in CONTENTdm.

Description. The Description field's value functions as a Caption for the picture. However, any additional information can be recorded in this field too. Don’t forget to transcribe the written information (handwritten, typed, or computer generated) that appears on or along with the photograph OR on the document accompanying the photograph. Transcribe as much information as possible relating to the photograph and/or the negative of the photograph. Don’t forget to look at the back of the photograph; there might be some information written there. Provide any necessary additional information to make the content clear to the users, for example the order of appearance of the individuals who appear in the photograph (Left to Right, First Row, Seated, Standing, etc.).

Creator Personal. Contains information on the personal name of the resource's creator. Don’t forget to separate each entry with a semi-colon (no space before, but give one space after) when typing them in the database. Do NOT put semi-colon after the last entry since it will appear on the web. This is the format used in CONTENTdm.

Creator Corporate. Contains information on the corporate name of the resource's creator. In a case where there might be both personal and corporate creators, a Corporate Creator is preferable to Personal Creator. Example: Frederick W. Faxon had a company named after himself (F. W. Faxon). In a case like this it is recommended to use "F. W. Faxon" as an entry in the Creator Corporate field, especially when a corporate contact information is available. Don’t forget to separate each entry with a semi-colon (no space before, but give one space after) when typing them in the database. Do NOT put semi-colon after the last entry since it will appear on the web. This is the format used in CONTENTdm.

Creator Role. Defines the role of the Creator Personal OR Creator Corporate. Entry options for this field are: Authors, Correspondents (correspondence writers), Correspondents (reporters), Editors, Event Organizers, Graphic designers, Photographers, Publishers, or Owner. Note that the Publisher role in this field is different from the Publisher field (see information on the Publisher field). When each name has a different role, enter the roles in the same order as the entries in the Creator Personal OR Creator Corporate field. When more than one creator has the same role, the role must still be entered once per creator. In short, the number of entries in the Creator Role field has to correspond to the number of entries in the Creator Personal OR Creator Corporate field. Don’t forget to separate each entry with a semi-colon (no space before, but one space after) when typing them in the database. Do NOT put a semi-colon after the last entry, since it will appear on the web. This is the format used in CONTENTdm.
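The one-role-per-creator rule lends itself to an automated check. The sketch below assumes both fields are available as semi-colon separated strings; the function names and the placeholder creator names are hypothetical.

```python
from typing import List, Tuple


def split_entries(value: str) -> List[str]:
    """Split a semi-colon separated field value into trimmed entries."""
    return [entry.strip() for entry in value.split(";") if entry.strip()]


def pair_creators_with_roles(creators: str, roles: str) -> List[Tuple[str, str]]:
    """Pair each creator with the role in the same position.

    Raises ValueError when the two fields have different numbers of entries.
    """
    names = split_entries(creators)
    role_list = split_entries(roles)
    if len(names) != len(role_list):
        raise ValueError(
            f"{len(names)} creator(s) but {len(role_list)} role(s) entered"
        )
    return list(zip(names, role_list))


# Placeholder names; two creators share the same role, so it is entered twice.
pairs = pair_creators_with_roles(
    "Creator A; Creator B", "Photographers; Photographers"
)
print(pairs)
```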

Creator Contacts. Records the contact information of the Creator Personal OR Creator Corporate. Information entered into this field will be treated as free text.

Theme. Contains the Series Title of the materials being digitized. It is recommended to use a slightly different Series Title than the one currently used in the Archives for the sake of clarity and access.

Sub Theme. Contains the Box Title of the materials being digitized. It is recommended to use a slightly different Series Title than the one currently used in the Archives for the sake of clarity and access. Since a Box Title may cover multiple subjects, it is recommended to use a single subject to represent one group of materials and another subject to represent another group, rather than a multiple-subject entry that covers all of them. Example: The materials for the Library Building Photographs theme (Series Title, 99/1/15) consist of several boxes. Each box covers several different states. Box 2 has the title "Alabama - District of Columbia." In this case it is recommended to break the Box Title into single state names, each representing all photographs and postcards from a particular state. Hence the Sub Theme "Alabama" should be used for all photographs and postcards about Alabama only, and so on.

Note: Although the Theme and Sub Theme fields can be represented by the combination of the Group Record, Subgroup Record, Series Record, and Box Record fields, they can be very useful for grouping materials of the same theme and sub theme (the aboutness) for online exhibition or browsing purposes. A resource can be assigned multiple Themes and/or Sub Themes, which gives us much more flexibility in arranging (grouping) the collections logically. This feature is very important, since archival materials are mostly arranged based on Provenance and Original Order, which does not necessarily group materials of the same aboutness together. Using the Theme and Sub Theme fields we will be able to virtually create a custom 'online exhibition' for users. For further information on this subject please refer to the Technical Documentation about Creating Predefined Custom Searches in CONTENTdm.

Inclusive Dates. Contains the date (year only) range of the materials being digitized. The Inclusive Dates information found in the Series Title can be used when it represents the real Inclusive Dates of the whole Sub Theme's materials being digitized. Otherwise enter the values by examining the earliest and latest date (year only) of the materials being digitized for a particular Sub Theme.

Language. Contains the Language used in the resource. For visual materials, use the language of the written information on the visual materials or on their accompanying document(s). If no information is available, use the default value "English." Refer to "ISO 639. Code for the representation of the names of languages" for the list of languages (use the Language Name column).

Work Dimensions. Contains the 'physical' dimensions of the material being digitized in centimeters (cm). Use the W x H (Width x Height) format for uniformity. To determine which dimension is the width and which is the height of the material, use the text or visual orientation of the material as a reference.
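A small formatter keeps Work Dimensions entries uniform. This is a sketch under one assumption: the trailing " cm" unit follows the style of the "21 x 15.5 cm" example given for Digital Dimensions, since the guidelines do not show a sample Work Dimensions value. The function name is hypothetical.

```python
def format_work_dimensions(width_cm: float, height_cm: float) -> str:
    """Format physical dimensions as 'W x H cm', width first, then height."""
    if width_cm <= 0 or height_cm <= 0:
        raise ValueError("dimensions must be positive")
    # The 'g' format drops a trailing '.0', so 21.0 is written as '21'.
    return f"{width_cm:g} x {height_cm:g} cm"


print(format_work_dimensions(21, 15.5))
```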

Medium. Refers to what type of material the 'material' (resource) being digitized is made of. Entry options for this field are: Black-and-white film, Color film, Photographic plates, Photographic paper, or Paper (fiber product). For other options refer to Getty's Art & Architecture Thesaurus.

Technique. Refers to the technique used to produce the item (resource) that is being digitized. Entry options for this field are: Handwriting, Drawing (image-making), Photocopying, Photographic processes, Black-and-white photography, Color photography, Aerial photography, Art photography, Astronomical photography, Digital Imaging, Documentary photography, Infrared photography, Stereoscopic photography, Time-lapse photography, or Underwater photography. For other options refer to Getty's Art & Architecture Thesaurus. Use only one technique that is dominant. Example: A black and white aerial photograph of UIUC campus has the "Aerial Photography" as the dominant technique, instead of Black-and-white photography.

 

ADMINISTRATIVE METADATA

Publisher. This field contains the corporate name of the organization that creates and/or publishes the digital surrogate of the resources on the web. For ALA Archives materials use "University of Illinois at Urbana-Champaign Archives - ALA Archives."

Digital Dimensions. Refers to the dimensions of the digital surrogate of the digitized material (resource). Entry options for this field are: Width x Height pixels (for visual materials, including digitized manuscripts) and minutes (for audio and video recordings). Use the Metric Systems. Example: 21 x 15.5 cm, 23 minutes.

Format. Refers to the format of the digital surrogate of the digitized material (resource). Entry options for this field are: JPG, TIF, GIF, WAV, MP3, RAM, MPG, etc. For other options refer to the Information Sciences Institute's Media Type list.

Resolution. Refers to the 'resolution' of the digital surrogate of the digitized material (resource). Entry options for this field are: x dpi (for photographs or other visual materials), x Hz (for audio recordings), or x frame/sec (for video recordings).

File Size. Refers to the size of the digital file (the digital surrogate produced by the digitization process). The file size is in bytes. It is a numeric-only field. Do not use any punctuation symbol (comma, period, etc.). Example: 30525 bytes.

Date Created. Refers to the date of the creation of the digital surrogate. Use the mm/dd/yyyy format for this field. Example: 08/23/1970, 06/07/1981, etc.
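The mm/dd/yyyy format can be produced and checked with Python's standard `datetime` module. A minimal sketch; the function names are hypothetical.

```python
from datetime import date, datetime

DATE_CREATED_FORMAT = "%m/%d/%Y"  # zero-padded month and day, four-digit year


def format_date_created(created: date) -> str:
    """Render a date in the mm/dd/yyyy form used by the Date Created field."""
    return created.strftime(DATE_CREATED_FORMAT)


def parse_date_created(value: str) -> date:
    """Parse an mm/dd/yyyy string; raises ValueError for an invalid date."""
    return datetime.strptime(value, DATE_CREATED_FORMAT).date()


print(format_date_created(date(1970, 8, 23)))
```

Parsing a value before upload catches impossible dates such as "13/45/2003", which `strptime` rejects with a `ValueError`.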

Resource Identifier. Refers to the File Name of the digital surrogate. Use only the File Name without the file extension, and use only lower case. In creating file names, use a naming system that reflects the structure of the collection. Example: For the Faxon Collection the naming system is "ala-ggbbsss-b-xxx" (e.g. "ala-9901014-5-104"), where ala represents the ALA Archives, gg = Record Group Number, bb = Record Sub Group Number, sss = Record Series Number, b = Box Number, and xxx = an incremental integer. This naming system was chosen because the whole Faxon Collection is contained in a single box and each photograph is already numbered. However, some photographs are not numbered but carry year information. For these, the naming system is "ala-ggbbsss-b-yyyy_xxx" (e.g. "ala-9901014-5-1916_01"). Some photographs carry no information at all; for these the naming system is "ala-ggbbsss-b-unknownxxx" (e.g. "ala-9901014-5-unknown1"). For the Library Building Photographs collection, the naming system for postcards is "ala-ggbbsss-tt-ppp_xxxf" (e.g. "ala-9901015-mi-bw_01f"), where tt = abbreviation of the State name, ppp = Type of Photograph (bw = Black and White, col = Color), xxx = an incremental integer, and f = the side of the postcard (f = Front, b = Back). The naming system for photographs is "ala-ggbbsss-tt-ppp_xxx" (e.g. "ala-9901015-il-col_01"). Note that no Box Number is included in this naming system, because the whole collection spans several boxes and each box can contain postcards and/or photographs from more than one state. These are just some examples of file naming systems that can be used; other naming systems can be implemented. However, it is recommended that the "ala-ggbbsss" section be retained for consistency.

One obvious advantage of using a file naming system consistently is providing a context for each digital file. A user might download a digital file and in the future will ask for any information regarding that particular digital file. If we implement a consistent file naming system that reflects the context we will still be able to know where that particular digital file came from in the system (assuming that the user does not change the file name) and provide the user with the information he/she needs.
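Recovering the archival context from a file name can be sketched as follows. The parser relies only on the shared "ala-ggbbsss" prefix, since the rest of the name differs between collections; the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# "ala-" + gg (2 digits) + bb (2 digits) + sss (3 digits) + "-" + the rest.
PREFIX = re.compile(r"ala-(\d{2})(\d{2})(\d{3})-(.+)")


class RecordContext(NamedTuple):
    group: int       # Record Group Number (gg)
    sub_group: int   # Record Sub Group Number (bb)
    series: int      # Record Series Number (sss)
    remainder: str   # collection-specific part (box, state, item number, ...)


def parse_resource_identifier(identifier: str) -> Optional[RecordContext]:
    """Return the record context for an identifier, or None if it does not
    follow the "ala-ggbbsss-..." convention (for example, a renamed file)."""
    match = PREFIX.fullmatch(identifier)
    if match is None:
        return None
    gg, bb, sss, rest = match.groups()
    return RecordContext(int(gg), int(bb), int(sss), rest)


context = parse_resource_identifier("ala-9901015-mi-bw_01f")
if context is not None:
    # Slash notation, as in the series reference 99/1/15.
    print(f"{context.group}/{context.sub_group}/{context.series}")
```

Interpreting the remainder (box number, state abbreviation, front or back) would need one rule per collection, which is why it is returned unparsed here.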

Copyright. Provides information on the copyright status of the material being digitized. For consistency, some templates have been created to provide information on the copyright status. It is recommended to use the templates as much as possible. Any specific information regarding the copyright status of an item (resource) can be recorded in the Copyright Notes field.

Copyright Notes. Holds specific copyright information that cannot be accommodated in any of the general copyright information templates available. This field also records any specific copyright information that will only be used as the Archives' internal notes. The information contained in this field will not be displayed to the users. The content of this field is only accessible by the server administrator or from the CONTENTdm Acquisition Module.

CD Volume. Contains the Volume Name of the Compact Disc or DVD used to store the Full Resolution images (TIF images; 600 dpi) of the digitized resources. As with the Resource Identifier field, it is very important to implement a consistent naming system to make access to the physical Compact Disc easier. It is highly recommended to retain the "ALA-ggbbsss" format for consistency. Example: The Volume Names for the F. W. Faxon Collection are ALA-9901014-5-1, ALA-9901014-5-2, etc., where the last digit is incremental and represents a series of Compact Discs that hold the Full Resolution images of the F. W. Faxon Collection. The Volume Names for Library Building Photographs are ALA-9901015-1, ALA-9901015-2, etc.

 

STRUCTURAL METADATA

Group. Holds the numeric value of the Record Group Number of the item (resource) being digitized. Refer to ALA Archives Record Group hierarchy to learn more about the organization of the ALA Archives at the University of Illinois at Urbana-Champaign.

Sub Group. Holds the numeric value of the Record Sub Group Number of the item (resource) being digitized.

Series. Holds the numeric value of the Record Series Number of the item (resource) being digitized.

Box. Holds the numeric value of the Box Number of the item (resource) being digitized.

Folder. Holds the alpha numeric value of the Folder Title of the item (resource) being digitized, including the date (year) information.

 

METADATA IN CONTENTdm

Implementing the metadata scheme explained above in CONTENTdm requires us to define the following attributes:

All the above options can be defined using the Collection Administration function of the Acquisition Module of CONTENTdm. Only the server administrator has the authority to make any modifications on the settings. Contact Nuala Bennett (nabennet@staffillinois.edu) for further information.

 

METADATA SUMMARY IN TABLE FORMAT

Table 1. ALA Archives Digital Collection Metadata Set in CONTENTdm Format
Field Name                       | Dublin Core Mapping | Data Type | Large Field | Searchable [1] | Hidden [2] | Controlled Vocabulary [3]
---------------------------------+---------------------+-----------+-------------+----------------+------------+--------------------------
Title                            | Title               | Text      | No          | Yes            | No         | Yes
Translation Title                | Title               | Text      | No          | No             | No         | No
Alternative Title                | Title               | Text      | No          | No             | No         | No
Type                             | Type                | Text      | No          | Yes            | No         | Yes
Digitized Material               | Description         | Text      | No          | Yes            | No         | Yes
Year Coverage                    | Coverage - Temporal | Text      | No          | Yes            | No         | No
PeriodName Coverage              | Coverage - Temporal | Text      | No          | No             | No         | Yes
Geographic Coverage              | Coverage - Spatial  | Text      | No          | Yes            | No         | Yes
Names Coverage                   | Subject             | Text      | Yes         | Yes            | No         | Yes
Subject                          | Subject             | Text      | Yes         | Yes            | No         | Yes
Description                      | Description         | Text      | Yes         | Yes            | No         | No
Creator Personal                 | Creator             | Text      | No          | No             | No         | Yes
Creator Corporate                | Creator             | Text      | No          | Yes            | No         | Yes
Creator Role                     | Creator             | Text      | No          | Yes            | No         | Yes
Creator Contacts                 | Creator             | Text      | Yes         | No             | No         | No
Publisher                        | Publisher           | Text      | No          | Yes            | No         | Yes
Theme                            | Description         | Text      | No          | Yes            | No         | Yes
Sub Theme                        | Description         | Text      | No          | Yes            | No         | Yes
Inclusive Dates                  | Date                | Text      | No          | Yes            | No         | No
Language                         | Language            | Text      | No          | Yes            | No         | Yes
Work Dimensions - Measurement    | Description         | Text      | No          | No             | No         | No
Medium                           | Description         | Text      | No          | Yes            | No         | Yes
Technique                        | Description         | Text      | No          | Yes            | No         | Yes
Digital Dimensions - Measurement | Description         | Text      | No          | No             | No         | No
Format - Measurement             | Description         | Text      | No          | Yes            | No         | Yes
Resolution - Measurement         | Description         | Text      | No          | No             | No         | No
File Size - Measurement          | Description         | Text      | No          | No             | No         | No
Date Created                     | Date - Created      | Date      | No          | Yes            | No         | No
Resource Identifier              | Identifier          | Text      | No          | No             | No         | No
Copyright                        | Rights              | Text      | Yes         | No             | No         | No
Notes - Copyright                | None                | Text      | Yes         | No             | Yes        | No
Group - Record                   | Source              | Text      | No          | No             | No         | No
Sub Group - Record               | Source              | Text      | No          | No             | No         | No
Series - Record                  | Source              | Text      | No          | No             | No         | No
Box - Record                     | Source              | Text      | No          | No             | No         | No
Folder - Record                  | Source              | Text      | No          | No             | No         | No
CD Volume                        | Identifier          | Text      | No          | No             | No         | No

[1] Searchable: The field name will be available for the user to select in the drop-down list in the CONTENTdm Search Interface. A Non-Searchable field cannot be used to search the collection using the Query Builder method either.
[2] Hidden: The field name and content will not be available or visible to users.
[3] Controlled Vocabulary: The field's value will be indexed and a list of all available entries will be available at the CONTENTdm Acquisition Module.
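The field properties in Table 1 can also be kept as data, which makes it easy to answer questions such as "which fields can a user search on?". The sketch below copies a handful of rows from the table for illustration; the class and function names are hypothetical, and this is not how CONTENTdm itself stores its settings.

```python
from typing import Dict, List, NamedTuple


class FieldProps(NamedTuple):
    dublin_core: str
    searchable: bool
    hidden: bool
    controlled_vocabulary: bool


# A subset of Table 1; the remaining rows would be added the same way.
FIELDS: Dict[str, FieldProps] = {
    "Title": FieldProps("Title", True, False, True),
    "Year Coverage": FieldProps("Coverage - Temporal", True, False, False),
    "Creator Personal": FieldProps("Creator", False, False, True),
    "Resource Identifier": FieldProps("Identifier", False, False, False),
    "Notes - Copyright": FieldProps("None", False, True, False),
}


def searchable_fields(fields: Dict[str, FieldProps]) -> List[str]:
    """Names of fields a user could pick in the search drop-down list."""
    return [
        name
        for name, props in fields.items()
        if props.searchable and not props.hidden
    ]


print(searchable_fields(FIELDS))
```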

The settings for each metadata field can be modified through the Collection Administration option under the Administration menu of the CONTENTdm Acquisition Module. However, you will need the proper security level to do that. If the Acquisition Module does not have the required security level to make modifications, contact the server administrator (Tim Cole) to request them. For further information on field properties, go to the CONTENTdm Online Help for Field Properties.

When starting a new collection, it is recommended to use the ALA/UIUC Archives Digital Collection Worksheet to assist the data entry for the first 20-25 records. The worksheet helps us get a feel for the various values that will be used in that particular collection, since it is easier to compare pages of records in printed form than on screen. It also helps us to be more consistent in assigning values to the various fields.


Composed by: Aditya Nugraha (anugraha_a@yahoo.com)
Created: July 2003
Last updated: August 01, 2003
