The following is copied from OAIS v1 (2002) and may be out of date

1 National Archives and Records Administration’s ELECTRONIC AND SPECIAL MEDIA RECORDS SERVICES DIVISION

1.1 DOMAIN

Domain and Consumers

The Electronic and Special Media Records Services Division is the organization within the U.S. National Archives and Records Administration (NARA) that appraises, accessions, preserves, and provides access to federal records in a format designed for computer processing. NARA serves as the archives for the records of the United States federal government. Consumers for this data are as diverse as the electronic records they seek to access and range from individuals seeking to assert their rights to other government agencies to academic researchers, private consultants, media personnel, and a wide variety of other users.

Data Producers

Originally these data are produced (created or received) by agencies of the U.S. federal government (producers). The data may concern virtually any area or subject in which the government is involved. They may come from a variety of computer applications such as data processing, word processing, computer modeling, or geographic information systems. They can include records made directly by government employees or indirectly through government grants and contracts.

Special Features

The most noted special feature of NARA’s Electronic Records program is the diversity of the collection of more than two billion logical data records in over 129,000 data sets from more than 100 bureaus, departments, and other components of executive branch agencies and their contractors and from the Congress, the Courts, the Executive Office of the President, and numerous Presidential commissions. A small portion of the data originally were created as early as World War II. An even smaller portion contains information from the nineteenth century that has been converted to an electronic format. Most of the data, however, has been created since the 1960s. The major types of holdings and subject areas include agricultural data, attitudinal data, demographic data, economic and financial statistics, education data, environmental data, health and social services data, international data, and military data.

Scientific and technological data already transferred to NARA include the National Register of Scientific and Technical Personnel; the National Engineers Register; the 1971 Survey of Scientists and Engineers; major portions of the National Ocean Survey’s Nautical Chart Data Base; numerous Environmental Protection Agency series relating to pesticide use, hazardous wastes, and pollution abatement; the Nuclear Regulatory Commission’s Radiation Exposure Information Reporting System; biometric data sets and epidemiological studies (such as the National Collaborative Perinatal Project) from the National Institutes of Health, the Centers for Disease Control, and the National Center for Health Statistics; and text from presidential commissions on Three Mile Island, coal, and the Space Shuttle Challenger Accident. NARA recently ingested the e-mail of the Executive Office of the President, including the White House Office and the Office of the Vice President for the period from 1986 through January 20, 2001. While NARA’s scientific and medical holdings are rich and varied, they do not fully reflect the extent and diversity of federal activity in this area.

1.2 INGEST

The ingest process begins with producers (records managers and records creators in federal agencies) inventorying all electronic records and determining how long to retain the records for current agency business. The next step in the process is for the producer and NARA to develop a Request for Records Disposition Authority, Standard Form 115 (SF 115), the formal submission agreement for all federal records. Here information on the content, retention and disposition, and the availability and extent of documentation and related reports is listed in the context of the producer’s business needs for the information. Data with continuing value are listed as permanent and the timing and frequency of their transfer to NARA is established. The producer submits the SF 115 to NARA for its review and appraisal. NARA appraises electronic records items on all SF 115s. Identifying permanently valuable electronic records for retention by NARA’s Electronic and Special Media Records Services Division involves cooperation between NARA and the producers. Through the process of scheduling and appraisal, NARA identifies and selects the electronic records it judges to have enduring value. NARA evaluates electronic records in terms of their evidential, legal, and informational value and their long-term research potential. Some of the factors in this appraisal evaluation include estimation of past, present, and probable future research value within the context of the data’s origin and current use and its impact on federal programs and policy. Administrative and legal value, as well as the potential for linkage with other data, may bear on the decision. Unaggregated microlevel data sometimes has the greatest potential for future secondary analysis. Once NARA determines that the records have enduring value, it then determines whether the records should be preserved in electronic format.

Submission Agreements

The actual Submission Information Package (SIP) between NARA and the agency that creates or receives the data is a Request to Transfer, Approval, and Receipt of Records to the National Archives of the United States, Standard Form 258 (SF 258) accompanied by the data object(s) and sufficient documentation and descriptive information to use the data. The SIP transfers physical and legal custody of the electronic records from the producer to NARA. This agreement is the end product of the ingest process described above. The SF 258 also contains any restrictions on access to the data that conform with exemptions listed in the U.S. Freedom of Information Act (FOIA). NARA enforces all legitimate restrictions on access. At the same time NARA also works with the producer to determine if any “disclosure-free” version of the data can be produced for consumers.

Typical Delivery Session

This inventorying, scheduling, and appraisal process specifies the data object(s) and related metadata and documentation to be transferred, and establishes the timing and frequency of submissions. Specific instructions for how the data are to be organized and when they should be submitted are established in the Code of Federal Regulations (36 CFR 1228.188). All data should be transferred on open-reel magnetic tape, tape cartridges, or CD-ROM. NARA negotiates acceptance of other forms of magnetic media, such as class 3590E or DLT, with producers. The CFR sets the specific technical requirements in terms of format, block size, and extraneous characters. While the current regulations also require that all SIPs be transferred in a software-independent format, NARA staff recognize that the research potential and utility of some data would be significantly reduced if they were transferred in such a format. In such cases NARA works with the producers to determine the best mode of transfer.

What are the Information Objects that are Delivered? Producers typically will transfer a series consisting of one or more data sets with the related documentation, which minimally should include the record layout and codes, methodology statements, and technical information about the data, including number of records and size. Ideally, the SIP also includes associated analyses and reports. Increasingly agency-created metadata also is included. The majority of electronic records come as flat files of data; increasingly, however, text files and output from database management systems and geographic information systems also are transferred.

What are Collections? NARA organizes all Archival Information Collections (AIC) on the basis of Provenance and Original Order. Provenance maintains the identity of an Archival Information Package (AIP) or an AIC and preserves as much information as possible about its origins and custodial history. Within NARA this is accomplished through the use of Record Groups which reflect the structure of the federal government and subgroups and sub-subgroups which place the AIPs and AICs within the producer’s place within its agency. Original order argues for maintaining the contents of an AIP or AIC in the order developed and used by the producer. This helps reveal the producer’s organization and how it used the data objects and can provide additional information to consumers. For electronic records, “original order” is expressed in the logical structure of files and databases and in the indexing which the producer used. Within NARA the basic unit for arrangement and description is the series which is an AIC that can include a number of related AIPs.

What Descriptive Information is Provided? The extent and quality of the descriptive information provided by the producer varies from quite sketchy to extremely detailed. NARA staff attempt to flesh out the producer-created descriptors with AIC level descriptions, title list entries, abstracts, and Dissemination Information Packages (DIP) and to provide the descriptive information in a variety of formats to reach different consumers.

What sorts of Validation Objects are Provided? Producers are required to transfer metadata and descriptors adequate to access, process, and interpret electronic records. For formatted data files the DIP must include a record layout with appropriate field definitions and codes. It frequently also includes methodology statements, input documents, data entry instructions, processing directions, sample outputs, reports and analyses of the information and system manuals.
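As an illustration of why a record layout and its codes are needed to interpret a formatted flat file, the following minimal sketch slices one fixed-width record into named fields and decodes a coded value. The layout, field names, sample record, and code table are all hypothetical and are not drawn from any NARA holding.

```python
# Hypothetical record layout: (field name, start offset, length).
# Invented for illustration; not taken from any real data set.
LAYOUT = [
    ("person_id", 0, 5),
    ("birth_year", 5, 4),
    ("state_code", 9, 2),
]

# Hypothetical code table of the kind the documentation would supply.
STATE_CODES = {"01": "Alabama", "02": "Alaska"}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width record into named fields using the layout."""
    expected = max(start + length for _, start, length in layout)
    if len(line) < expected:
        raise ValueError(f"record too short: {len(line)} < {expected}")
    return {name: line[start:start + length] for name, start, length in layout}


# Without the layout, this string is just eleven opaque characters.
record = parse_record("00012195402")
decoded_state = STATE_CODES.get(record["state_code"], "unknown code")
```

Without the layout the sample record cannot be divided into fields, and without the code table the value of the coded field cannot be read as a state name.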

What Transformation Processes are Performed Prior to Storage?

What Metadata is Created? The most extensive metadata product created by NARA is the DIP. It includes an Introduction which can discuss the origin, creation, and administrative uses of the data object(s), list related objects that are or will be available, and discuss characteristics of the data that could cause problems for consumers based on initial validation processes. The DIP also can include sample printouts of the data and tables and reports related to computer verification of the data. NARA also captures metadata on record layouts, domains, ranges, and links between files in a metadata database as a byproduct of the automated verification process. Other metadata created by NARA staff include AIC descriptions, formatted abstracts, title line entries, and collective descriptions which place the data in a broader context. Increasingly, producer-created metadata is part of the SIP transferred to NARA.

What Validation is Performed? NARA’s initial ingest procedures include creating a new preservation master and backup copy of each data object on new certified media to ensure the best physical media for long-term storage. This procedure includes a 100% byte for byte comparison between the SIP media and the AIC media. At this time staff perform automated verification of the data contents with the record layout and codes, and of the physical structure including the number of records, blocks, and bytes. Staff also perfect the DIP to facilitate secondary use of the data.
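The two checks described here, a byte-for-byte comparison of the transferred copy against the new master and a verification of the physical record count, can be sketched as follows. This is a minimal illustration that assumes both copies are available as ordinary files of fixed-length records; the function names, file names, and 80-byte record length are hypothetical and do not describe NARA's actual tooling.

```python
import os
import tempfile


def files_identical(path_a, path_b, chunk_size=65536):
    """Return True only if both files contain exactly the same bytes."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files ended at the same point
                return True


def count_fixed_records(path, record_length):
    """Count fixed-length records; fail if the size is not a whole multiple."""
    size = os.path.getsize(path)
    if size % record_length != 0:
        raise ValueError("file size is not a multiple of the record length")
    return size // record_length


# Demonstration with two temporary files standing in for the two media.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "transfer.dat")
    master = os.path.join(tmp, "master.dat")
    for path in (source, master):
        with open(path, "wb") as handle:
            handle.write(b"A" * 80 * 3)  # three hypothetical 80-byte records
    copies_match = files_identical(source, master)
    record_count = count_fixed_records(master, 80)
```

Comparing sizes first allows an early exit, but the result rests on the full comparison of contents rather than on a checksum.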

Security

All AICs are maintained off-line. Consumers access only DIPs. The AIC preservation master and backup copies are maintained in separate secure stacks at two different physical locations. AICs that require additional security measures, for example Bureau of the Census data subject to restrictions imposed under Title 13 of the United States Code and national security classified information restricted under Executive Order, are afforded the appropriate level of protection. NARA is moving to provide enhanced access to selected data onsite by providing reference copies on a wider variety of media and by providing a broader range of services and output products. This may include use of vendors who can provide enhanced access to the holdings utilizing “value-added” services.

1.3 INTERNAL FORMS

How do you Store your Data? All preservation master and backup copies of AICs are stored on newly certified class 3480 magnetic tape cartridges. Some of the holdings have not yet been migrated from nine-track, 6250 bpi open-reel magnetic tape. Data are received and stored temporarily on other media including diskettes, 4mm, 8mm, CD-ROM, DLT, and various removable hard drives, although not all of these media conform with regulatory requirements.

Migration (Data). Based on recommendations from the media manufacturers, the National Technology Alliance, the National Institute of Standards and Technology, and various standards organizations, NARA has been migrating its AICs to new class 3480 magnetic tape cartridges when each media unit is ten years old. NARA continues to reassess storage media. NARA anticipates storing larger AICs on class 3590E and/or DLT cartridges as appropriate.
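The ten-year rule can be expressed as a simple date test. The sketch below assumes that each media unit's age is measured from the date it was written; the inventory, unit identifiers, and dates are hypothetical.

```python
from datetime import date


def migration_due(written_on, today, max_age_years=10):
    """Return True once a media unit has reached its tenth anniversary."""
    try:
        due = written_on.replace(year=written_on.year + max_age_years)
    except ValueError:
        # 29 February has no anniversary in a non-leap year; use 1 March.
        due = date(written_on.year + max_age_years, 3, 1)
    return today >= due


# Hypothetical inventory: media unit identifier -> date written.
inventory = {"T-0001": date(1990, 6, 1), "T-0002": date(1996, 2, 29)}
due_now = sorted(
    unit for unit, written in inventory.items()
    if migration_due(written, date(2001, 1, 1))
)
```

Here only T-0001 is due on 1 January 2001, since T-0002 does not reach ten years until 2006.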

Migration (Metadata). Metadata has been stored in a variety of formats depending on the original format transferred with each AIC. Traditionally most metadata existed in textual format. The metadata captured in the verification process is maintained in a relational database. There are no current plans for migrating from this format, although the metadata can be exported in flat file format. NARA encourages data producers to create and transfer metadata in electronic form. In the near future NARA will begin scanning and digitally converting metadata so it can be preserved and provided in an electronic format along with the data.

Migration (Format). The Code of Federal Regulations requires data producers to transfer all data in ASCII or EBCDIC, with all extraneous characters removed from the data except record length indicators or tape marks, and blocked at no higher than 32,760 bytes per block for open-reel tape and 37,871 bytes per block for class 3480 magnetic tape cartridges. When CD-ROMs are used, they must conform to the ISO 9660 standard and the data must be in discrete files containing only the permanent data. Additional software files and temporary files may be included on the CD-ROM. The CFR also requires all electronic records to be transferred in a software-independent format. NARA works with data producers who cannot meet those requirements to determine the most appropriate transfer and storage formats.
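The block-size limits stated above lend themselves to a small arithmetic check. The sketch below assumes fixed-length records with no record length indicators inside the block; the function names and the 80-byte record length are hypothetical, and only the two byte limits come from the text.

```python
# Maximum block sizes in bytes, as stated in the text above.
MAX_BLOCK_BYTES = {"open-reel": 32760, "3480": 37871}


def check_blocking(medium, record_length, records_per_block):
    """Return the block size in bytes, or raise if it exceeds the limit."""
    limit = MAX_BLOCK_BYTES[medium]
    block_size = record_length * records_per_block
    if block_size > limit:
        raise ValueError(f"{block_size} bytes exceeds {limit} for {medium}")
    return block_size


def max_records_per_block(medium, record_length):
    """Largest whole number of fixed-length records that fits in one block."""
    return MAX_BLOCK_BYTES[medium] // record_length


# With hypothetical 80-byte records:
open_reel_max = max_records_per_block("open-reel", 80)  # 409 records, 32,720 bytes
cartridge_max = max_records_per_block("3480", 80)       # 473 records, 37,840 bytes
```

A producer whose blocking factor exceeds these values would need to reblock the file before transfer.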

1.4 ACCESS

What Finding Aids are Provided?

Information about the holdings is available at multiple levels of detail and from multiple sources, so that different consumers can find information about NARA’s holdings. The least specific detail is available in the 1996 three-volume Guide to Federal Records in the National Archives of the United States, where AICs are described in the context of the larger holdings from a producer. Other collective descriptions include Information About Electronic Records in the National Archives for Prospective Researchers, General Information Leaflet 37, which also is available on the Division’s homepage (http://www.nara.gov/nara/electronic), and a title list of data sets available on the Division’s homepage and as a printout. Specific AIC descriptions were created as formatted metadata for a portion of the Division’s holdings for inclusion in a proposed automated description database, which has not been implemented. The most detailed description for any data set is the DIP. Each DIP may contain a narrative describing the data file(s), the record layout and codes for the data, a methodology, sample input forms and questionnaires, annotations regarding the data validity, and a bibliography. The Division also has established an email address (cer@nara.gov) for queries regarding its holdings and services.

Security.

All NARA holdings are maintained in environmentally controlled closed stacks which are accessible only by NARA staff. Master and backup copies of the data are stored in separate vaults in separate locations to facilitate disaster recovery. The Division’s national security classified data sets are in separate environmentally controlled stacks approved for the storage of classified information. All processing is performed in limited access processing rooms at NARA or at other government computer centers. Computer processing is done on closed systems which require both a registered logon and personal identification number or password to access the system. Researchers do not have direct access to any AIC. Presently they access copies of the data that they have purchased for their own use.

Customer Service/Support.

The Division has a staff dedicated to providing reference services to the public and to the staffs of other federal agencies. The staff responds to both general and specific inquiries by telephone, letter, email, or in-person visit and fills orders for copies of specific data and their DIPs. For a limited number of AIPs the staff also provides information from records to respond to researcher requests. The staff also functions as a filter between researchers and the data producers when problems develop in understanding or interpreting the data. The staff develop a variety of informational material about NARA’s holdings and services, much of which is available online.

Do You Support Subscriptions?

NARA accepts standing orders (subscription) for electronic records that it receives on a regular, periodic basis from producers of the Federal government. Under current NARA regulations all subscriptions must be prepaid prior to shipment.

What Media/formats do you use?

Currently NARA provides DIPs on a variety of magnetic media, including nine-track open-reel magnetic tape or class 3480 magnetic tape cartridges encoded in ASCII or EBCDIC, labeled or unlabeled and written to the maximum block size requested; diskettes for smaller data sets; and CD-ROM. NARA also can provide an exact copy of records in nonstandard formats, if the agency transferred them this way, but it cannot validate or verify the contents of these files. In the past these other formats have included packed decimal, zoned decimal, binary, National Information Processing System (NIPS), Statistical Analysis Software (SAS), Statistical Package for the Social Sciences (SPSS), or OSIRIS. On-line transfer of SIPs to NARA via File Transfer Protocol (FTP) was implemented in 2001; providing DIPs via FTP may occur as early as 2002.

What Transformation (Value Added) is Provided?

NARA currently preserves AICs as received from the producers; it does not routinely provide customized DIPs or other value-added services beyond computer verification of the AIC contents and enhanced documentation. Planned enhancements include value-added services such as custom DIPs and data transformation.

Pricing Policies.

NARA uses a cost-recovery fee schedule developed by the National Archives Trust Fund. The 2001 charges include a basic order handling fee of $89.00 with an additional fee of $9.00 for each file. Media costs range from $2.50 for diskette to $22.50 for a 9-track open reel. Paper reproductions cost $10.00 for a minimum order of up to 20 pages; additional pages are $0.50 per page. If the documentation is on microfiche, copies are $2.50 per fiche.
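A worked example may help in reading the 2001 fee schedule. The sketch below rests on two assumptions that the text does not state: that the $9.00 fee applies to every file in an order, and that the media cost is charged per media unit. The function names and the sample order are hypothetical.

```python
from decimal import Decimal

# 2001 charges quoted in the text above.
ORDER_FEE = Decimal("89.00")
PER_FILE_FEE = Decimal("9.00")
MEDIA_COST = {
    "diskette": Decimal("2.50"),
    "9-track open reel": Decimal("22.50"),
}
PAPER_MINIMUM = Decimal("10.00")   # covers up to 20 pages
PAPER_EXTRA_PAGE = Decimal("0.50")
FICHE_COST = Decimal("2.50")


def data_order_cost(files, medium, media_units=1):
    """Handling fee plus per-file fee plus media (assumed charged per unit)."""
    return ORDER_FEE + PER_FILE_FEE * files + MEDIA_COST[medium] * media_units


def paper_cost(pages):
    """Minimum charge for the first 20 pages, then a per-page charge."""
    if pages <= 0:
        return Decimal("0.00")
    return PAPER_MINIMUM + PAPER_EXTRA_PAGE * max(0, pages - 20)


# Hypothetical order: three files on one 9-track reel, 35 pages of documentation.
data_total = data_order_cost(3, "9-track open reel")  # 89.00 + 27.00 + 22.50
paper_total = paper_cost(35)                          # 10.00 + 15 * 0.50
order_total = data_total + paper_total
```

Under these assumptions the sample order comes to $138.50 for the data and $17.50 for the paper documentation, or $156.00 in total.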

Dissemination Security.

The same security considerations developed in relation to Access apply to Dissemination. NARA’s national security classified data is made available only to researchers who have both the appropriate security clearances and the appropriate need-to-know. Other restricted data are made available only with prior written approval of the creating agency or under the terms of the restrictions which must be supported as a legitimate exemption under the Freedom of Information Act.

1.5 SPECIAL CHARACTERISTICS

NARA’s Electronic and Special Media Records Services Division has a diverse collection which reflects the diverse activities of the federal government. The staff shape the holdings through the process of scheduling, appraisal and accessioning. Currently, NARA acquires less than one percent of all federal records created in an electronic format. The timing of the transfer of electronic records from the creating federal agency to NARA is negotiated with the creator to ensure that the records are available for agency use for as long as necessary for current business and that they are transferred to NARA as soon as practicable to ensure their long term preservation for secondary use. NARA is the only federal agency with an explicit archival mandate for Federal records and thus the only Federal agency that preserves and provides access to a wide range of historically valuable records for the indefinite future. As such it is an archives of last resort for the electronic records of some federal agencies which undertake an active data dissemination function while there is a researcher interest in the data but whose mandate ceases or may cease once the demand wanes or ceases.

Topic revision: 28 Mar 2025, DavidGiaretta