HEP Libraries Webzine
Issue 3 / March 2001
Anita Hollier (*)
Abstract
The article looks at the changing role of the archivist in the electronic
age. It considers the new challenges posed, and some of the ways in which
they are being met. Physical threats to records include technology obsolescence
and physical deterioration of media. But a more fundamental issue that
archivists and records managers have had to face is to reconsider what
it is that constitutes a genuine 'record'. Only then can policies be put
in place to ensure that they are preserved for as long as they are needed.
Introduction
Over the last 50 - 100 years technological advances have changed the
way we handle data to such an extent that we can now speak of a real 'information
revolution'. Simple and efficient methods now exist to create, retrieve
and share information. However, there are some corresponding headaches
for those in professions dedicated to the preservation of records. As Jeff
Rothenberg famously said "digital information lasts forever - or five years,
whichever comes first" [1]. This article aims
to summarise some of these issues, and to consider the ways in which archivists
and records managers are having to rethink their profession in order to
meet the new challenges. While these changes are not painless, they bring
many benefits in new ways of working and closer relationships with colleagues
in related professions. The way forward in this field is certainly through
international and cross-disciplinary cooperation, and a number of models
are already emerging.
The challenge
Preserving access to the record
Increasingly librarians and archivists are finding their roles converging
as both take on responsibility for the long-term preservation of digital
materials. So although I will speak mainly about archives it is fitting
to begin by referring to a study commissioned by the Research Libraries
Group (RLG), which stated "digital materials, regardless of whether they
are created initially in digital form or converted to digital form, are
threatened by technology obsolescence and physical deterioration." [2]
The first of these threats arises from the fact that electronic records,
unlike their counterparts on paper, are not directly readable by humans.
They are machine-dependent, and if the hardware and software required to
read them are no longer available they cannot be accessed. The classic example
is that of the 1960 US Census records. In 1976 the National Archives identified
some of this data, which the Census Bureau had retained in what it regarded
as 'permanent' storage, as having long-term historical value. However,
by that time the UNIVAC type II-A tape drive needed to read the tapes was
long obsolete. Often some sort of data rescue can be achieved (as was finally
the case for most of the 1960 Census records), but it may be very expensive
and there can be no guarantee that all the information will be recovered.
The media themselves (tapes, disks, etc) are inherently less stable than
traditional materials. Whilst paper or parchment can survive ‘benign neglect’
for many years, magnetic media deteriorate more quickly and need to be
regularly checked and refreshed. Electronic documents are also vulnerable
to accidental damage; they can be deleted with a single keystroke.
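One common way to carry out the regular checking mentioned above is to compare each file against a checksum recorded when it entered storage. The following is a minimal sketch only: the manifest layout (file name mapped to a SHA-256 digest) and the function names are assumptions made for illustration, not a procedure described in the sources cited here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare each file under root with its previously recorded digest.

    The manifest (hypothetical format) maps a relative file name to the
    digest recorded earlier. The result gives a status per file:
    'ok', 'changed' or 'missing'.
    """
    report = {}
    for name, recorded in manifest.items():
        path = root / name
        if not path.is_file():
            report[name] = "missing"
        elif sha256_of(path) == recorded:
            report[name] = "ok"
        else:
            report[name] = "changed"
    return report
```

A check of this kind only detects deterioration or accidental change; it does nothing about the obsolescence of the hardware and software needed to read the file.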
Preserving a genuine record
The International Council on Archives defines a record as "a specific
piece of recorded information generated, collected or received in the initiation,
conduct or completion of an activity, and that comprises sufficient content,
context and structure to provide proof or evidence of that activity." [3] This
rather long definition was developed precisely because of the new challenges
posed by electronic media. With the advent of computers it is very easy
to create and store information and documents that do not have the characteristics
of a genuine 'record'. In this definition, the content refers to the actual
data, words, numbers, sounds, etc, produced by the creator of the record. The structure refers
to the relationships between these data, for example which words represent
a heading, or an address, or the body of the text. It also includes the
links between different files that may make up one record, for example
in a relational database. The context provides information about the activity
and circumstances which gave rise to the record, who produced it, why it
was produced, how it was used, how it relates to other records, etc. Of
these, the context is often the most difficult to preserve in an electronic
document.
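The three components of the definition can be pictured as fields of a simple data structure. This is an illustrative sketch only: the class name, the field types and the `missing_components` helper are assumptions, not part of the ICA guide.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Illustrative model of the three components named in the definition."""

    content: str                                   # the data, words, numbers, etc.
    structure: dict = field(default_factory=dict)  # e.g. which text is the heading
    context: dict = field(default_factory=dict)    # who produced it, why, related records

    def missing_components(self) -> list[str]:
        """List the components that are empty, in a fixed order."""
        parts = {
            "content": self.content,
            "structure": self.structure,
            "context": self.context,
        }
        return [name for name, value in parts.items() if not value]


# A memo saved with its text and layout but no contextual information.
memo = Record(
    content="Budget approved.",
    structure={"heading": "Decision", "body": "Budget approved."},
)
print(memo.missing_components())  # ['context']
```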
Recognition of the need for contextual information has long been at the heart of archival principles. Most published items, such as library books, contain a certain amount of explanatory information as well as their main content, for example there will be the name of the author, a publication date, and probably some sort of general introduction which gives some idea of why the book was produced and for what sort of audience it is intended. An individual archival record usually lacks this information; it is simply a piece of recorded information that was produced in the course of some activity. Hopefully an office memo made sense in the context in which it was created - the creator and other users understood what it was about and what role it played in their business. But how much sense is it going to make, by itself, to a historian consulting it in an archive 100 years, or even 10 years, later? For this reason archivists try to catalogue each record thoroughly, giving particular importance to its provenance. For this reason also, the original order of records is respected when they enter an archive. The way the records were arranged during their active life (e.g. the office filing system) may not be the most convenient way of arranging the data for future historians but it does give invaluable information on the context in which the different items were created and how they relate to each other. This would be lost if the records were rearranged.
So, for paper records, preserving record systems can provide valuable contextual information that will help the records to be interpreted in the future. But one of the great advantages of electronic documents is that they do not need to be stored in one physical place. They are easily shared between work groups, and virtual documents can be created 'on the fly' to satisfy each individual's requirements. This means that the sort of contextual information that was intrinsic to most paper records is often lacking in the electronic environment. If it is not actively provided, for example by the provision of adequate metadata, the documents can be difficult to use for current business and may lose their value as archival records.
Once a valid record has been created it must be preserved as a valid
record for as long as it is required, and here the issue of authenticity
is crucial. Another great advantage of electronic records for business
purposes is that they can easily be edited and updated, but it is important
to keep track of these changes in order to guarantee that the record really
is the genuine article. It is very easy to alter electronic documents without
anyone being aware of the fact. Of course, paper records can be forged
too, but this is harder to do and easier to detect. This concern is clearly
shown in the uncertainty surrounding the legal admissibility of electronic
records, but in fact this is only one instance of the general rule that
a genuine record must provide valid evidence.
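Keeping track of changes so that silent alteration becomes detectable can be sketched with a chained log, in which each entry's digest also covers the digest of the entry before it. This is one possible technique, offered as an assumption for illustration; the field names and functions are hypothetical and are not drawn from any of the projects discussed in this article.

```python
import hashlib
import json


def _entry_digest(entry: dict) -> str:
    """Hash the entry fields in a stable (sorted-key) JSON form."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_change(log: list[dict], editor: str, new_text: str) -> None:
    """Append a change whose digest also covers the previous entry's digest."""
    entry = {
        "editor": editor,
        "text": new_text,
        "previous": log[-1]["digest"] if log else "",
    }
    entry["digest"] = _entry_digest(entry)
    log.append(entry)


def verify_log(log: list[dict]) -> bool:
    """Recompute every digest; an altered entry breaks the chain."""
    previous = ""
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "digest"}
        if body.get("previous") != previous:
            return False
        if _entry_digest(body) != entry.get("digest"):
            return False
        previous = entry["digest"]
    return True
```

A chain like this shows that an individual entry has been altered, but someone able to rewrite the whole log could recompute every digest, so in practice it would need to be combined with access controls or an independently held copy of the latest digest.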
Meeting the challenge
Nature of the record
A number of projects have addressed the issue of what constitutes a record. In 1993 the University of Pittsburgh School of Information Sciences began a research project to examine variables that affect the integration of recordkeeping requirements in electronic information systems. [5] The major objectives of this research project were to develop a set of well-defined recordkeeping functional requirements, satisfying all the various legal, administrative and other needs of a particular organization, which can be used in the design and implementation of electronic information systems. In September 1996 the project published a Framework for Business Acceptable Communications, including Functional Requirements for Evidence in Recordkeeping, and Metadata Specifications based on them. Its specifications have been adopted as the basis for a number of archival electronic recordkeeping implementations, for example at Indiana University.[6]
At around the same time, a research project by the University of British Columbia's School of Library, Archival and Information Studies aimed to identify and define the requirements for creating, handling and preserving reliable and authentic electronic records.[7] The UBC Project researchers worked in close collaboration with the U.S. Department of Defense Records Management Task Force to identify requirements for Records Management Applications (RMA). The resulting standard (DOD 5015.2) [8] is now in use by the U.S. Defense Information Systems Agency to certify RMA vendors. InterPARES (International Research on Permanent Authentic Records in Electronic Systems) is the second phase of the UBC Project, intended to address the long-term preservation of inactive electronic records (i.e. records which are no longer needed for day-to-day business but which must be preserved for operational, legal, or historical reasons) [9].
Technical issues
This is intended only as a very brief overview. More detailed information is available elsewhere [10]; in particular, guidelines for storage and physical conservation are well covered [11]. There are three main technical strategies to address digital preservation, and each has its supporters and critics and its own advantages and disadvantages (Jean Marie Deken gives a good introduction to this and other matters concerning archives and records management in the electronic environment) [12].
Policy issues
Since good information management is fundamental to all organisations it is important for them to ensure that their corporate policy is clear and well communicated, and that there is adequate funding and expertise available to implement it. Where many organisations are facing the same challenges it makes sense for them to try to work together, and this is particularly true in the archive profession where the individual units can be quite small. In the UK, funding from Resource (the recently formed Council for Museums, Archives and Libraries) has supported the production of the Preservation Management of Digital Materials Workbook [13]. This is currently available as a pre-publication draft, and aims to identify good practice in creating, managing and preserving digital materials, and to provide practical tools to assist in that process. Amongst other things it describes the sort of issues that should be addressed in formal corporate policies, such as the status and definition of the electronic record, and relevant roles and responsibilities within the organisation. It then treats corporate strategies in more detail:
"The following issues should be addressed in corporate strategies and may well require other supporting documents setting out in more details how the strategies can be achieved:
Changing approaches
What's different now?
From a preservation point of view a major difference between paper and digital documents is the time scale in which decisions and action must be taken. Actions and decisions that used to be taken towards the end of a document's life, where the choice was between 'destroy' or 'send to the archive', now typically need to be taken at the moment when the document is created so that it can be actively managed throughout its life-cycle. Indeed the whole concept of a 'life-cycle' has been called into question as inappropriate to the electronic environment, where records move into and out of active use much more often [15]. Likewise it is no longer clear that the best way to preserve a record is for it to enter the custody of the archive. Sometimes it may be more appropriate for the archive to have 'intellectual' control of the record, but for it to be physically located and preserved elsewhere [16].
Another change is that many more stakeholders are now involved in the
long-term preservation of a record. The creators of the documents, the
IT specialists who manage them, as well as archivists, records managers
and librarians all have a role to play and their immediate concerns and
priorities may be very different. Further confusion may arise from differences
in terminology which can make it difficult for them to cooperate effectively;
for example, words such as 'record' and 'archive' have clear but quite
different meanings to archivists and IT specialists. Some misunderstandings are
more subtle, for example 'metadata' is similarly understood by both, but
tends to have a broader meaning to archivists than in the IT world where
the emphasis is more on automatically captured data to aid resource discovery.
It is worthwhile for collaborations to spend some time at the outset establishing
a common vocabulary.
What has to change?
The most important step to meet these changes is for archivists and records managers to work more closely with the other professionals involved in electronic recordkeeping. One way to facilitate the involvement of all relevant sectors is for an organisation to establish a cross-disciplinary committee to develop its electronic recordkeeping policies and objectives. This committee can then allocate implementation tasks wherever the relevant skills exist. This cooperation can function in-house, between organisations or internationally (e.g. the multidisciplinary DLM Forum organised by the European Commission) [17].
Such cooperation will also allow archivists and records managers to be involved at the point of creation of the record (which has long been their aim in theory, but too seldom put into practice) rather than the point of destruction. Records appraisal now needs to be built into recordkeeping systems at the design stage. This goes hand in hand with a move towards functional appraisal, which uses tools such as business process modelling which were previously more familiar to systems designers than to archivists. This method makes explicit the business processes and information flow in the unit, then identifies the points at which a record (i.e. evidence) is required. The National Archives of Australia have adopted this as the basis of good recordkeeping [18]. The traditional records management approach has been to conduct a record survey to see what documents exist then to draw up retention schedules to specify for how long each of these series of records must be kept. The schedules are usually organised by record series and often a different one is produced for each organisational unit. This means that they quickly become out of date when any of these factors change, so some records managers have long been advocating functional appraisal [19]. The new type of retention schedule for electronic records will reflect a more 'top down' approach and will contain mainly policy decisions about what activities and transactions the organisation needs to keep evidence of. These requirements will normally remain valid for some time. They will be supported by implementation guidelines, i.e. the detail of exactly how this will be achieved, and these will change as technology or organisational structure changes. The records inventory is still useful in drawing up these guidelines, and acts as a reality check on the business process model.
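The split between stable policy decisions and changeable implementation guidelines can be sketched as two separate tables. Everything in this example is invented for illustration: the activities, retention periods, system names and the `disposal_date` function are assumptions, not taken from any real schedule.

```python
from datetime import date
from typing import Optional

# Policy layer: which business activities need evidence, and for how long.
# Activities and periods (in years) are invented; None means keep permanently.
RETENTION_POLICY = {
    "staff-recruitment": 7,
    "equipment-purchase": 10,
    "board-decision": None,
}

# Implementation layer: where the evidence of each activity currently lives.
# This mapping is expected to change with technology or reorganisation,
# while the policy layer above stays valid.
IMPLEMENTATION = {
    "hr-database/applications": "staff-recruitment",
    "finance-system/orders": "equipment-purchase",
    "shared-drive/minutes": "board-decision",
}


def disposal_date(series: str, closed: date) -> Optional[date]:
    """Return the earliest disposal date for a record series.

    Returns None when the underlying activity must be kept permanently.
    """
    activity = IMPLEMENTATION[series]
    years = RETENTION_POLICY[activity]
    if years is None:
        return None
    try:
        return closed.replace(year=closed.year + years)
    except ValueError:
        # 29 February closed date and a non-leap target year.
        return closed.replace(year=closed.year + years, day=28)
```

If the finance system were replaced, only the `IMPLEMENTATION` table would need editing; the decision to keep evidence of purchases for a given period would be untouched.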
Archivists and records managers are never going to be able to drive
business processes, but they are able to contribute to process improvement.
Information is a valuable resource and any organisation will benefit from
using it more efficiently. Conversely, if valid records are not properly
managed, not only will an archival resource be lost to future historians,
but the organisation also faces a number of risks including failure to
produce evidence that it has fulfilled its obligations, loss of proof of
ownership or rights, inability to find information required for current
activities, loss of efficiency if this information is difficult to find,
and potential high costs of 'rescuing' it if it is not in a condition to
be accessed directly. A good information strategy will ensure that when
information is needed the user has access to a single authorised source
that is easy to find and is maintained for as long as it is needed, but
no longer. If access to the information needs to be restricted, appropriate
controls can be put in place, but otherwise the whole organisation will
benefit from sharing and exploiting it efficiently. This lies at the heart
of the Knowledge Management that is currently in vogue, and also has an
important role to play in Quality Assurance procedures. It is not only
big business that is taking this on board; for example, the Joint Information
Systems Committee (JISC) promotes the innovative application and use of
information systems and information technology in higher and further education
across the UK [20].
Best practice
A number of projects are underway which address various aspects of the
electronic information challenge. There have been efforts to agree acceptable
practices in the creation of digital material, and these guidelines will
hopefully become even more widely observed. These usually include the observance
of appropriate standards, the use of open, non-proprietary data formats,
provision of documentation and metadata in accordance with emerging standards,
and assigning permanent names to online digital resources.
There is not yet any one accepted solution to long-term preservation,
and probably there never will be, given the rapidly changing environment and
the differing needs of different organisations, but some common approaches
are emerging. One example is the Open Archival Information System (OAIS)
model described below. Of the various practical implementations of electronic
recordkeeping I have chosen to describe briefly that of the Public Record
Office Victoria, Australia. It is not necessarily typical of models followed
elsewhere, but it is a very good, practical working example.
Open Archival Information System OAIS [21]
An initiative by NASA's Consultative Committee for Space Data Systems has led to the development of a Reference Model for an Open Archival Information System (OAIS) that is now being reviewed as a draft International Organization for Standardization (ISO) standard and is expected to become a fully fledged standard in future [22]. It addresses the issue of what constitutes a genuine archive in the electronic era (as opposed to the sort of ‘archiving’ that really equates to moving data offline). But it is not limited to traditional archives; indeed it has more following in the library community where it has influenced (or been adopted by) a number of digital initiatives such as CEDARS, PANDORA and NEDLIB. By their definition an OAIS is an archive (consisting of an organisation of people and systems) that has accepted the responsibility to preserve information and make it available for a designated community. It lists the mandatory responsibilities of an OAIS archive as follows. It must:
Victorian Electronic Records Strategy (VERS) [23]
The report, Keeping Electronic Records Forever, commissioned by Public Record Office Victoria (PROV) in 1995-96, advocates that instead of taking a system-oriented approach to electronic records, a data-driven approach is more appropriate as the records will outlast any system developed to manage them. In order to achieve this the VERS project team first examined how records were created within Government and how they were used and archived, and made this explicit via process maps. They also looked at developments elsewhere in the world. They concluded that an electronic record had to be a fully self-documenting object, and chose to describe these objects in XML. They also determined that an electronic record was made up of one or more documents, contextual information relating this record to other records, and evidential integrity checks. They then built an archival system prototype and a retrieval system prototype, and created several sample working environments. They adopted a standard record format, the 'layered' (onion) model which encapsulates the documents, the context, and authentication in a single object, and they also adopted standard metadata sets (based on the Pittsburgh project model). The Final Report details the VERS project's findings and includes a general description of their prototype system, functional descriptions of electronic archiving, details of the VERS long-term electronic record format, the metadata schema used by the project and costings of possible VERS-compliant system implementations.
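The idea of a self-documenting object that carries its documents, context and integrity check together can be sketched in a few lines. This is not the VERS record format: the element names (`Record`, `Context`, `Item`, `Document`, `Integrity`) are invented for illustration, and a plain SHA-256 digest stands in for whatever authentication mechanism a real system would use.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def encapsulate(document: bytes, context: dict[str, str]) -> ET.Element:
    """Wrap a document, its context and an integrity check in one XML element.

    Element names are hypothetical; they are not the VERS schema.
    """
    record = ET.Element("Record")
    ctx = ET.SubElement(record, "Context")
    for name, value in context.items():
        ET.SubElement(ctx, "Item", name=name).text = value
    doc = ET.SubElement(record, "Document", encoding="base64")
    doc.text = base64.b64encode(document).decode("ascii")
    check = ET.SubElement(record, "Integrity", algorithm="sha256")
    check.text = hashlib.sha256(document).hexdigest()
    return record


def verify(record: ET.Element) -> bool:
    """Check that the embedded document still matches its stored digest."""
    document = base64.b64decode(record.findtext("Document"))
    return hashlib.sha256(document).hexdigest() == record.findtext("Integrity")


rec = encapsulate(
    b"Minutes of meeting",
    {"creator": "Registry", "series": "Minutes"},
)
xml_text = ET.tostring(rec, encoding="unicode")
```

Because the document is stored as base64 text inside the XML, the object can be read back with any XML parser without reference to the system that created it. In this sketch the digest covers only the document, not the context.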
Conclusion
Electronic records are just records. Archivists have long agreed that they should not be accorded greater or lesser importance because of their format. However, their format does require a change of approach. The policy may not change but its implementation does. Established principles of archives and records management can form a sound basis for new approaches, for example by clarifying the essential requirements of a genuine record, but they should not be allowed to limit the view. Archivists urgently need to learn how to work effectively with their fellow professionals who now share the responsibility for the preservation of records. In future good recordkeeping needs to be built into systems, not applied retrospectively at the end of a record's life-cycle, and archivists and records managers alone are unlikely to achieve this. Positive action is needed, but sometimes we all need a little encouragement that it is achievable. Help is at hand! Our professional societies and training bodies offer guidelines. Most of the projects I have mentioned here have excellent Web pages, which include bibliographies citing many other related studies and projects [24]. Standards are emerging and many practical implementations are helping to establish and share best practice. The Victorian Electronic Records Strategy offers the following heartening words: "As a result of this project we have concluded that the:
[2] Margaret Hedstrom and Sheon Montgomery
"Digital Preservation Needs and Requirements in RLG Member Institutions"
(December 1998)
URL:<http://www.rlg.org/preserv/digpres.html>
For more general information, see URL:<http://www.rlg.org/longterm/>
[3] International Council on Archives. Guide For Managing Electronic Records From an Archival Perspective Committee on Electronic Records, ICA Studies/Etudes CIA 8, (February 1997)
[4] Committee on the Records of Government
(1985) Report. Reprinted, Malabar, Florida: Robert E. Krieger
Publishing Company, Inc. (1988).
[5] URL:<http://www.lis.pitt.edu/~nhprc/>
[6] As part of a project to deal with its
own electronic records, the Indiana University Electronic Records Project
developed a methodology to evaluate information systems against the University
of Pittsburgh Functional Requirements for Evidence in Recordkeeping. Several
phases and tasks for the methodology have been identified. These include:
1) a description of the business through functional decomposition and
identification of transactions and
associated evidence,
2) a description of the associated information system in terms of the
identified transactions,
3) evaluation of the information system against the Functional Requirements
in the context of the identified
transactions of the business, and
4) recommendations for intervention to satisfy the Functional Requirements.
URL:<http://www.indiana.edu/~libarche/index.html>
[7] URL:<http://www.interpares.org/UBCProject/>
[8] Design Criteria Standard for Electronic
Records Management Software Applications. DOD 5015.2-STD
(Department of Defense, USA). This standard is endorsed by NARA (National
Archives and Records Administration)
and is the first example of a federal agency developing formal criteria
for electronic records management.
In addition to the standard, a software test procedure has been developed
along with a register of records management software applications products
that have passed the test.
URL:<http://jitc.fhu.disa.mil/recmgt/index.htm>
[9] URL:<http://www.interpares.org/>
[10] See, for example: Preserving Access
to Digital Information (PADI) <http://www.nla.gov.au/padi/>
[11] Conservation OnLine (CoOL), produced
by the Preservation Department of Stanford University Libraries, is a full
text library of conservation information covering the conservation of library,
archives and museum materials. It includes information on electronic media
and records.
URL:<http://palimpsest.stanford.edu/>
See also Standards Published on Permanence of Electronic Materials
URL:<http://www.cd-info.com/CDIC/Industry/news/ansi.html>
[12] Jean Marie Deken "Electronic Recordkeeping:
An Introduction" Invited talk presented at the US Department of Energy
Records Management Conference (5/17/99 - 5/20/99)
URL:<http://www.slac.stanford.edu/~jmdeken/papers/SLAC-PUB8152.html>
[13] Neil Beagrie and Maggie Jones Preservation
Management of Digital Materials Workbook: a pre-publication draft (October
2000)
URL:<http://www.jisc.ac.uk/dner/preservation/workbook/>
[14] For example, the British Standards Institution's Code of practice on legal admissibility and evidential weight of information stored electronically, BSI DISC PD0008:1999
[15] See, for example: G. O'Shea "Keeping
electronic records: issues and strategies" Provenance: the electronic
magazine, Vol 1, No. 2 (1996)
URL:<http://www.netpac.com/provenance/>
[16] Barbara Reed, "Appraisal and disposal", in Judith Ellis (ed.), Keeping Archives, 2nd edition (Port Melbourne, 1993)
[17] This is a multidisciplinary forum on
the problems of the management, storage, conservation and retrieval of
machine-readable data, organised by the European Commission. It takes
quite a high-level view to solving
problems, formulating a "DLM-message" to the Information and Communication
Technology Industry,
and trying to influence funding of projects. But its 1996 conference
also included some interesting case studies such as the description by
David Bowen of Pfizer's Central Electronic Archive.
URL:<http://europa.eu.int/ISPO/dlm/>
[18] "Australian Standard AS 4390–1996, Records
Management" provides a methodological framework for organisations to develop
recordkeeping strategies that satisfy business needs, accountability requirements
and community expectations. The Designing and Implementing Recordkeeping
Systems (DIRKS) methodology is an eight-step process which agencies can
use to design and implement AS 4390 compliant recordkeeping systems.
URL:<http://www.naa.gov.au/recordkeeping/dirks/summary.html>
[19] See, for example:
Jeff Morelli "Process-driven retention scheduling" Records Management
Bulletin Issue 94 (1999);
G. O'Shea "Keeping electronic records: issues and strategies"
Provenance: the electronic magazine, Vol 1, No. 2 (1996) URL:<http://www.netpac.com/provenance/>;
and
Tom Ruller URL:<http://www.truller.com/>
[20] URL:<http://www.jisc.ac.uk/>
[21] OAIS Consultative Committee for Space
Data Systems "Reference Model for an Open Archival Information System (OAIS)"
CCSDS 650.0-R-1. Red Book (May 1999)
URL:<http://ftp.ccsds.org/ccsds/documents/pdf/CCSDS-650.0-R-1.pdf>
[22] For a good overview of this initiative,
see: Brian Lavoie "Meeting the challenges of digital preservation:
the OAIS reference model". OCLC Newsletter. No. 243. (January/February
2000)
URL:<http://www2.oclc.org/oclc/pdf/news243.pdf>
[23] URL:<http://www.prov.vic.gov.au/vers/welcome.htm>
[24] See also, URL:<http://www.records.nsw.gov.au/publicsector/erk/websites/websiteguide.htm>
[25] The Victorian Electronic Records
Strategy Final Report (31 March 1999)
URL:<http://www.prov.vic.gov.au/vers/final.htm>
E-mail: Anita.Hollier@cern.ch