High Energy Physics Libraries Webzine
Issue 9 / February 2004
Cathrine Harboe-Ree, Andrew Treloar (*)
There is a growing interest among academic institutions in managing institutional digital content produced in and for research, teaching and learning. This article argues for an integrated approach to this activity and examines the role of libraries in facilitating this. It then considers some existing approaches to providing software to support this. Australian initiatives consistent with this integrated model are then discussed.
There is a growing interest among academic institutions in collecting, preserving, reusing and creating value-added services from digital content produced in and for research, teaching and learning. The emphasis on research outputs and collaboration, and distance, flexible and online learning, together with developments in information technology, has led to an increased awareness that the digital content being created by members of the academic community is an institutional asset. This content is also increasingly being recognised as an institutional challenge, requiring both tactical management and a strategic response.
At the same time, many academic libraries are responding to the challenges of new technologies by taking the opportunity to redefine their fundamental role in the creation, distribution and provision of access to information. Over the past decade libraries have moved almost completely towards a digital platform for management of the information (both print and electronic) that they acquire or subscribe to. They have built significant digital collections of material published by others, and they are increasingly producing new content themselves. Often this content originates from, or is the intellectual property of, their own institutions.
Meanwhile, all around the world, universities and their libraries, faculties, research centres, and information technology and course development units are trying to cope with the digital revolution. There is a growing recognition and articulation of the convergence that is occurring among the various digital initiatives in which universities are engaged, and of the opportunities for synergies and more significant outcomes through collaboration.
Neil McLean stresses the need for this growing convergence to be viewed from a service perspective, rather than a delivery perspective. He argues that no online learning or research environment can be successful without relatively seamless access to information resources at the point of need, and says further that the challenge remains for the various service providers to find a balance between systems support, "learning containers", information resources and sound pedagogical principles.
Work on the COLIS (Collaborative Online Learning and Information Services) model at Macquarie University has focused on testing the feasibility of interoperable standards as a way of managing interactions between a range of electronic services. Through the success of the COLIS model, McLean and others have demonstrated that the new electronic environment can and must comprise a complex interactive matrix that is dependent on the information resources mentioned above, as well as on user directories, content and rights management software, and metadata repositories.
Sally A Rogers, from Ohio State University, argues that the full array of a university's digital assets and information services should be broadly defined, and should include the library's catalogue, the electronic journals, reference databases and other electronic resources available through the library, as well as institutional repositories and resources created or collated elsewhere in the university, such as course material. She notes the overlapping of such initiatives as digital collections, course web sites, electronic course packs and learning objects, the desirability of integration to search across these repositories and the development of standards to promote interoperability. Rogers also highlights the potential of increased interoperability and connectivity to generate innovation in research, teaching and learning.
The November 2002 report of the Higher Education Information Infrastructure Advisory Committee (HEIIAC) of the Australian Government Department of Education, Science and Training (DEST) identified the following critical features of an enhanced research infrastructure:
The HEIIAC report was primarily concerned with managing the current problems associated with scholarly communication and publishing, and it stressed the need to adopt a national collaborative approach. Neil McLean, Ohio State University and others embrace scholarly communication strategies and then argue that they should be incorporated into a more holistic approach to the management of institutional digital content and intellectual capital.
The merging of these two approaches would yield substantial benefits to Australian university communities, consistent with the following statements of principle:
These statements reflect the HEIIAC objectives and place them into a framework that, if implemented, would improve institutional and national efficiency and effectiveness. They are, of course, relevant in a non-Australian context also.
In this context, the move by academic libraries to establish e-print repositories and e-publishing capability should be seen as consistent with an environment that already contains a complex and converging suite of intellectual property sources required to support the contemporary research and learning environment.
Staff at Ohio State University have coined the phrase Knowledge Bank to refer to this complex environment. In developing its Knowledge Bank concept Ohio State University identified a number of expected benefits, which include:
There are many barriers to the adoption of an integrated approach, not least the size of the task, the lack of integration software, the cost and challenge of undertaking such a task in its entirety, and the confusing plethora of possible approaches. Also to be overcome in many instances are significant cultural differences, especially within institutions, and the apparent durability and acceptance of current patterns of information creation and dissemination. Nevertheless, there are significant risks in not adopting this approach, including the cost of managing information ineffectively and with duplicated effort, and the potential alienation of end users.
In thinking about how to take on such an integrated approach, it is important to consider where it should best be housed. Libraries are natural, although not exclusive, information management leaders within their universities, based on their traditional print and growing digital content management expertise. This is not to say that academic libraries can or should assume responsibility for overall information management within their institutions. However, it does suggest that they should be well positioned to exercise that leadership role, either as advisors, managers or practitioners.
Many Australian university libraries have started the process of redefining their role, but rarely to the extent of their American and European counterparts. We (that is, Australian university libraries) need to examine this issue to determine whether or not we want to redefine our role, and, if so, what strategies we need to put in place. We need to examine the issue of the resources required to broaden or fundamentally change our role, noting that many of the most innovative changes overseas have occurred as the direct result of significant government or philanthropic grants, building on an underlying base quite a lot stronger than that enjoyed by most Australian university libraries.
It is also important to examine the internal, intra-university and national cultural and management constraints. Libraries are only one of the groups within a university that may see themselves as partial owners of this territory. Information Technology groups may claim technical expertise, and Learning and Teaching groups may see that any collaborative endeavour should be primarily driven by pedagogical considerations. The challenge is to avoid on the one hand "the horse designed by a committee" and, on the other, the "tragedy of the commons". All the key players need to be involved but no one player should dominate.
There is a question, though, about how well equipped Australian university libraries actually are to maintain or extend their leadership role to the advantage of their institutions. The barriers to this include:
One of the most pressing issues we as libraries have to deal with, perhaps surprisingly, is a lack of relevant technical expertise. I say "perhaps surprisingly", because our natural strength is our ability to effectively manage large and complex information repositories. It can be argued that libraries have a number of areas of weakness technically that are currently preventing them from breaking through into a wider information management role. Libraries need to urgently develop their expertise in the areas of XML (Extensible Markup Language) and metadata, both of which are emerging as essential to the storage, use, integration, dissemination and preservation of information, and other wider information tools such as content management systems.
Libraries have stronger metadata than XML expertise, but this expertise is limited to a smaller than required number of staff, given the important place metadata has in the management of web-based information. To play a leadership role within their institutions, libraries must increase the number of staff with metadata expertise and the range of schemata to which this expertise can be applied.
Libraries can adopt a leadership role, and can overcome their technical deficiencies, but they cannot proceed in isolation. They must therefore address the range of barriers, referred to in this paper, to change and to the adoption of a collaborative and integrated approach to the management of digital content, for whatever purpose.
So what are some areas where libraries are taking on this leadership role? One area that has seen significant work in the last two years is that of digital institutional repositories. These are defined by Clifford Lynch as:
"A university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution." 
A number of groups are now working to provide the necessary infrastructure on which to build this set of services. The two best known pieces of software are DSpace and FEDORA, but there are a number of other contenders.
DSpace is a joint project of MIT Libraries and Hewlett-Packard to develop a software system that enables institutions to:
It is being made available under the BSD open source license to other groups to run as-is, or to modify and extend as needed.
The current version of DSpace (1.1.1) can best be thought of as a general-purpose repository application, with a series of both hard-wired and preferred behaviours. It is designed to provide stable long-term storage needed to house the digital products of MIT faculty and researchers. DSpace is intended to have different advantages for different stakeholder groups:
"For the user: DSpace enables easy remote access and the ability to read and search DSpace items from one location: the World Wide Web.
For the contributor: DSpace offers the advantages of digital distribution and long-term preservation for a variety of formats including text, audio, video, images, datasets and more. Authors can store their digital works in collections that are maintained by MIT communities.
For the institution: DSpace offers the opportunity to provide access to all the research of the institution through one interface. The repository is organized to accommodate the varying policy and workflow issues inherent in a multi-disciplinary environment. Submission workflow and access policies can be customized to adhere closely to each community's needs." 
While DSpace grew out of the needs of MIT, a group of North American and European universities is now participating in the DSpace Federation, which will test the existing software and offer suggestions about how to further develop and improve it.
DSpace supports a wide range of content types, and particular installations can easily extend the range available.
FEDORA is both a software platform and an architecture (actually the Flexible Extensible Digital Object and Repository Architecture). The architecture came out of Digital Library work done in the computer science field in the late 1990s. The software platform came out of Mellon funding to the University of Virginia and Cornell University to provide a production-quality instantiation of this architecture. Increasingly, the term FEDORA (which was first used over 5 years ago as an acronym for the architecture) is now being used to refer to this software.
In this latter sense, FEDORA is "an open source, digital object repository system using public APIs exposed as web services." FEDORA can best be thought of as services-mediation infrastructure, rather than an off-the-shelf application. It can use web services to call other services as well as expose its own services using web services standards. Key to the FEDORA architecture (yes, I know this is like referring to an ATM Machine...) is its underlying object-based model. FEDORA stores digital content objects, either as datastreams contained within the repository or as links to external resources. It also stores disseminators, which are ways to render these digital content objects. The software maintains bindings between content objects and their disseminators. Each object has a default disseminator, but may be able to be disseminated in other ways. This architecture is extremely flexible, and provides significant advantages as a platform on which to build other applications.
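The object model described above can be illustrated with a minimal Python sketch. This is not FEDORA's actual API or storage format: every class, method and identifier here (`Datastream`, `DigitalObject`, `bind`, `disseminate`, `demo:1`, the example URL) is a hypothetical name chosen for illustration. It only shows the idea of content objects holding internal or externally linked datastreams, with named disseminators bound to them and one default rendering.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Datastream:
    """Content held in the repository (data) or referenced externally (url)."""
    ds_id: str
    data: Optional[bytes] = None
    url: Optional[str] = None

    def is_external(self) -> bool:
        return self.data is None and self.url is not None


# Assumption: a disseminator is modelled as a function that renders an object.
Disseminator = Callable[["DigitalObject"], str]


@dataclass
class DigitalObject:
    """Hypothetical content object with datastreams and bound disseminators."""
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    disseminators: Dict[str, Disseminator] = field(default_factory=dict)
    default_disseminator: str = "default"

    def add_datastream(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def bind(self, name: str, disseminator: Disseminator) -> None:
        # Record the binding between this object and a way of rendering it.
        self.disseminators[name] = disseminator

    def disseminate(self, name: Optional[str] = None) -> str:
        # Fall back to the default disseminator when none is requested.
        key = name or self.default_disseminator
        return self.disseminators[key](self)


def list_datastreams(obj: DigitalObject) -> str:
    """Default rendering: list each datastream and where it lives."""
    lines = []
    for ds in obj.datastreams.values():
        location = ds.url if ds.is_external() else "internal"
        lines.append(f"{ds.ds_id}: {location}")
    return "\n".join(lines)


def title_only(obj: DigitalObject) -> str:
    """Alternative rendering: return just the title datastream as text."""
    return obj.datastreams["TITLE"].data.decode("utf-8")


thesis = DigitalObject(pid="demo:1")
thesis.add_datastream(Datastream("TITLE", data=b"An example thesis"))
thesis.add_datastream(Datastream("PDF", url="http://example.org/thesis.pdf"))
thesis.bind("default", list_datastreams)
thesis.bind("title", title_only)
```

Calling `thesis.disseminate()` uses the default rendering, while `thesis.disseminate("title")` selects the alternative one. Keeping the renderings separate from the stored content is what allows new views to be added without touching the objects themselves.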
Version 1.2 of FEDORA, released in late December 2003, provides versioning of both objects and their disseminators, as well as a Java-based Administration GUI.
There is also a range of other open-source repository projects underway. The Soros Institute is currently maintaining a document which summarises the functionality of many of them. In addition to DSpace, the current version also reviews FEDORA, CDSWare, MyCoRe, i-Tor, eprints.org and ARNO. Each comes out of a particular response to the challenges of managing large amounts of digital content, and each has its own strengths and weaknesses.
The guidelines for submissions identified the following requirements to be met by successful bids:
In response to this call, 14 projects were submitted, of which four were funded. The successful projects were:
These four projects were funded for a combined total of A$12 million over a period of three years, with funding commencing at the start of 2004.
This redevelopment of the ADT metadata repository is an essential adjunct to existing institutional initiatives by providing efficient search services and metadata support services. A partnership with ProQuest Information and Learning, an approach being taken by a number of North American universities, provides opportunities for retrospective digitisation of Australian higher degree theses. If successful, this will enable faster progress towards a critical mass of online content.
The lead institution for this project is the University of New South Wales. Partners in this initiative are Curtin University of Technology; University of Melbourne; University of Queensland; Sydney University of Technology; ProQuest Information and Learning.
The ARROW project will identify and test a software solution or solutions to support best-practice institutional digital repositories. The ARROW model architecture is shown in the figure below.
The common repository layer will almost certainly be an extant piece of institutional repository software. One of the early project tasks will be to identify this software from the range already discussed.
On top of this storage layer, the project will build/integrate software to provide functionality to handle electronic theses, e-prints, and open-access electronic publishing. A wide range of digital content types will be managed via these modules. In addition, the National Library of Australia (NLA) will develop a repository and associated metadata to support independent scholars (those not associated with institutions). The DEST Research Directory module will enable institutions to better manage returns to DEST about research output for the year (similar to the Research Assessment Exercise (RAE) in the UK). Finally, ARROW is open to the possibility of other categories of content later.
The access layer activity of ARROW is the development and testing of national resource discovery services (developed by the NLA) using metadata harvested from the institutional repositories, and the exposing of metadata to provide services via protocols and toolkits. This will also include a potential path for the redevelopment of the Australian Digital Theses (ADT) metadata repository incorporated into the NLA's national resource discovery services.
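The idea of a discovery service built on harvested metadata can be sketched in a few lines of Python. This is an illustration only, not the ARROW or NLA design: the repositories are simulated as in-memory lists, no real harvesting protocol is used, and all names (`harvest`, `search`, `repo-a`, `repo-b`, the record fields) are hypothetical.

```python
from typing import Dict, Iterable, List


def harvest(repositories: Dict[str, Iterable[dict]]) -> List[dict]:
    """Collect metadata records from each repository, tagging their source."""
    index = []
    for repo_name, records in repositories.items():
        for record in records:
            entry = dict(record)          # copy so the repository data is untouched
            entry["source"] = repo_name   # remember which repository holds the item
            index.append(entry)
    return index


def search(index: List[dict], term: str) -> List[dict]:
    """Case-insensitive substring match on the title field of each record."""
    term = term.lower()
    return [r for r in index if term in r.get("title", "").lower()]


# Assumption: each institutional repository exposes simple title metadata.
repositories = {
    "repo-a": [{"id": "a1", "title": "Digital preservation of theses"}],
    "repo-b": [
        {"id": "b1", "title": "Open access e-prints"},
        {"id": "b2", "title": "Thesis metadata quality"},
    ],
}

index = harvest(repositories)
hits = search(index, "thes")
```

Only the metadata is copied into the central index; the content itself stays in the institutional repositories, and each hit carries a `source` field pointing back to where the item is held.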
Initially ARROW will be tested in the four partner institutions, prior to it being offered more widely across the higher-education sector. The solution will be open-standards based, or will support open standards, and will facilitate interoperability within and between participating institutions.
The lead institution for this project is Monash University. Partners are the University of NSW, Swinburne University of Technology, and the National Library of Australia.
The lead institution for this project is the Australian National University. APSR partners are the National Library of Australia; University of Queensland; University of Sydney; the Australian Partnership for Advanced Computing.
The lead institution for this project is Macquarie University, the home of the COLIS initiative referred to earlier. Partners in this activity are the National Library of Australia; Education.au; Telstra; University of Southern Queensland; University of New England; University of Tasmania; University of Newcastle; University of Western Australia; Curtin University of Technology; Edith Cowan University; Murdoch University; Notre Dame University; Internet2/MACE Shibboleth.
The projects listed above are all focused on research activities and outputs. This is, in part, because the government funding for these activities was tied to the research sector. But it is also, I suspect, because this is seen as the most important part of the university and therefore one that should benefit first from an integrated approach.
Moving forward, Australian universities will need to recognise that they operate in three distinct spheres (Research, Teaching and Learning, and Community Engagement/Cultural Activities). In addition, nearly half of all university personnel work in the Administrative (or support services) sector, which has its own distinct information needs.
The challenge in Australia, and elsewhere, will be to extend this integrated digital repository approach to these other sectors of endeavour, adapting it and extending it as required along the way. The ARROW project has already identified this as an area of future activity. It is unclear exactly what this will mean, but the journey will certainly be both challenging and interesting.
 Harboe-Ree, C., Sabto, M. and Treloar, A., "The library as digitorium: new modes of creation, distribution and access", Proceedings of VALA 2004, Melbourne, February (in press).
 Rogers, S.A., "Developing an institutional Knowledge Bank at Ohio State University: from Concept to Action Plan", in portal: Libraries and the Academy, January 2003. URL: http://www.lib.ohio-state.edu/Lib_Info/rogersKBdoc.pdf
 McLean, N., Libraries and e-learning: Organisational and Technical Interoperability. URL: http://www.colis.mq.edu.au/news_archives/demo/docs/lib_e_learning.pdf
 Australian Commonwealth Department of Education, Science and Training, Research Information Infrastructure Framework for Australian Higher Education. The Final Report of the Higher Education Information Infrastructure Advisory Committee (Systemic Infrastructure Initiative). November 2002. URL: http://www.dest.gov.au/highered/otherpub/heiiac/exec_summary.htm
 Payette, Sandra & Staples, Thornton, "The Mellon Fedora Project: digital library architecture meets XML and web services", Sixth European Conference on Research and Advanced Technology for Digital Libraries. Lecture notes in computer science, vol. 2459. Springer-Verlag, Berlin Heidelberg New York (2002) 406-421. URL: http://www.fedora.info/documents/ecdl2002final.pdf
 Staples, Thornton, Wayland, Ross & Payette, Sandra, "The Fedora Project: an open-source digital object repository management system", in D-lib Magazine, April 2003. URL: http://dlib.org/dlib/april03/staples/04staples.htm
 DEST, Information Infrastructure - Call for Proposals 2003. URL: http://www.dest.gov.au/highered/research/proposal.htm#1
 DEST, Information Infrastructure - Outcomes of Selections Process. URL: http://www.dest.gov.au/highered/research/outcomes2003.htm
Tel: +61 3 990 52665
Address: Matheson Library, Monash University, Clayton Campus,
Wellington Rd, Clayton VIC 3168 Australia
Dr Andrew Treloar
Project Manager (Strategic Information Initiatives), ARROW Technical Architect, and Adjunct Librarian, Monash University: http://www.monash.edu.au/
Tel: +61 3 990 51138
Address: Building 3A, Monash University, Clayton Campus,
Wellington Rd, Clayton VIC 3168 Australia
Andrew Treloar tries to think strategically about information management, information policy, enterprise architectures, digital repositories, and electronic publishing. As a result, he never gets enough time for reading or working in his vegetable garden.
Cathrine Harboe-Ree and Andrew Treloar, "Connecting the Dots Downunder: Towards An Integrated Institutional Approach To Digital Content Management", High Energy Physics Libraries Webzine, issue 9, March 2004.