HEP Libraries Webzine
Issue 11 / August 2005
01/07/2005
New information technologies have had many influences on the information world. E-resources (electronic journals, databases, web resources and e-books) are increasingly present in libraries, offering users alternative access points to information. In this digital environment, licensed electronic information is increasingly the dominant source of content. As a result, librarians have lost direct control over their collections and must rely on aggregators and publishers for the supply of collection usage measurements.
E-journals have been present in the CERN Library catalogue since 1998. Today, the collection of e-journals (including online-only and those having a print counterpart) comprises about 1517 titles (Source CDS). These resources have been catalogued, displayed and promoted and an acquisition and management policy has been implemented [1]. The existence of e-journals in the CERN Library during the last ten years calls for some reflection on their impact and potential.
E-journals are generally just as expensive as printed journals and the number of titles available electronically and the add-on services continually increase. For research institutions and their libraries, there is not only the question of balancing a limited overall budget but also of how the research budgets can best be used. In fact, research institutions are confronted with having to pay for both the research itself and for the published results through journal subscriptions. As Dallman explains:
“Concerning journal subscriptions, we find the situation for some particle physics core journals (such as Physics Letters B) rather bizarre. The community pays large sums of money to build the accelerators and carry out the experiments, and the editors and referees are also researchers in the field who carry out these tasks in their research time. But then we are asked to pay more and more to buy back the results we have produced, without which there would be no journal to publish in the first place” [2]
In view of the problems of those institutions which effectively have to pay twice and of the possibility of a solution that alternative (open access) electronic publishing models might offer, we need some clarity of the situation in order to take future decisions about publishing scientific articles. And, these decisions should be informed by knowledge of the use of e-journals. Several studies on the integration of e-resources into library environments and the creation of electronic libraries can be found but at present, the use of e-journals still poses more questions than there are answers, particularly concerning preservation, distribution and commercialization. In addition, we do not really know how to effectively measure usage.
Electronic technology provides an opportunity for improving journal usage measurement, at least for e-journals. Measurements from the vendor's web site are possible, and librarians and publishers could profit from a better understanding of the use of electronic information to feed into their management decisions. Libraries could use indicators for cancellation and purchase decisions, for learning more about users' habits (how they use resources, identification of areas where training might be appropriate, file format preferences, e.g. PDF or HTML), and for evaluation of performance [3]. Publishers use statistical data to aid their editorial decision-making and to see the usage via routes/interfaces [4].
Usage statistics for e-journals are sent from most publishers to purchasing libraries or consortia. The formats and indicators of these statistical reports are based on the ICOLC (International Coalition of Library Consortia) guidelines [5] or on COUNTER (Counting Online Usage of NeTworked Electronic Resources) recommendations [6]. ICOLC’s Guidelines for statistical measures of usage of web based information, published in 1998, define the indicators that should be provided, and they invite a commitment from information providers to supply reports to libraries. They described the need for the library community to measure usage of the new resources. As an answer to this appeal information suppliers started to provide usage statistics to libraries.
In 2002, the COUNTER Code of Practice was launched, developed by publishers and librarians. This Code of Practice is an attempt to standardize usage statistics reports according to different levels of accomplishment. For achieving level 1, publishers have to provide reports that include the number of full-text articles requested by journal and by month as the main indicator. The choice of this indicator over others is based on the idea that journal users are mainly interested in reading articles. The draft of Release 2 of the COUNTER Code of Practice [7] continues to focus on the measurement of full-text requests but adds to level 1 a differentiation between full-text versions in PDF and HTML formats.
Publishers started to deliver usage statistics reports including data corresponding to the indicators mentioned by ICOLC, for example tables of contents requests, abstract requests, total sessions, article downloads, turn-aways, searches, etc. and libraries quickly became aware of the difficulty of comparing data from different vendors.
However, the lack of standardization in the data offered and in the metric definitions is the main difficulty for comparison and analysis. The result is that today, some publishers only achieve COUNTER Code of Practice level 1, standardizing their reports but reducing their indicators to only two.
Statistics are therefore still not consistent. The absence of tools to analyze and compare the data is another gap. We do not think that this situation should stop analysis, but that finding tangible examples to study is a way to go forward.
We would like to contribute to the study of usage statistics of e-journals and so have looked at some new methods for doing so. Our main goal is to understand the use of the CERN e-journals collection. In order to have an accurate analysis we will take into consideration:
The article is divided into three parts. The first section presents the context: CERN and its Library e-journals collection. The second section presents the ICOLC, COUNTER and publishers’ reports. The third section presents usage statistics and the different approaches for comparing and analyzing data.
We will not explore previous e-metrics research in this paper. There are various studies and projects on this subject, for example, the e-metrics project of the Association of Research Libraries (ARL) [8], which began in 2000, and other efforts in e-metrics (e.g. [9], [10], [11], [12], [13], [14], [15]). However, what is clear is that there is no certainty about what is measured by the action of a click, a browse or a print. Without more precision on this point it is difficult to draw conclusions.
CERN is the European Laboratory for Particle Physics located in Geneva. It provides scientific research facilities to some 3000 permanent staff and 6500 visiting scientists. It is the world's largest particle physics centre, used by about half of the world's particle physicists. The users represent 500 universities from over 80 countries. The CERN staff design and build CERN's intricate machinery and ensure its smooth operation. They help prepare, run, analyze and interpret the complex scientific experiments and carry out the variety of tasks required to make such a special organization successful. Constructing these highly advanced machines is an extremely costly operation. CERN publication policy states that all results from CERN experiments and research are made public. Most of the scientific papers written by CERN collaborators are available through the library or the CDS (CERN Document Server) [16].
The CERN Library offers an e-collection of preprints, published articles, theses, annual reports, conference proceedings, databases, Internet resources, e-journals and e-books. The CERN Document Server (CDS) hosts e-versions of full-text article preprints and links preprint records with the published version if a CERN subscription to access that journal exists. The acquisition policy is elaborated by the Working Group for Acquisitions and the Scientific Information Policy Board.
The CERN Scientific Information Service has been active in the field of digital library research and in providing scientific information services to the high-energy physics community for almost five decades now. The research focus has been on interoperability issues for document storage and retrieval systems, metadata added-value services, digital library automation and networked information services [17], [18].
The periodicals catalogue contains separate records for printed journals and for electronic versions with more than twice as many of the latter as of the former. The e-journals are either online versions of printed titles or electronic-only journals (listed at http://cdsweb.cern.ch/?c=Periodicals&as=0&ln=en and http://library.cern.ch/electronic_journals/ej.html respectively). The subjects covered are high energy physics and related subjects such as astronomy, mathematics, physics in general, computer science, electronics, etc. Online journal subscriptions are obtained through site or consortium licenses. Since 2002 the CERN Library has been a member of the Swiss Consortium of Academic and Research Libraries.
The ICOLC ‘Guidelines for Statistical Measures of Usage of Web-Based Indexed, Abstracted, and Full Text Resources’ [5] ask for six types of information to be produced on a regular basis :
The COUNTER Code of Practice for journals [6], [7] establishes recommendations for two levels of accomplishment and for the delivery of four reports containing:
Under the COUNTER Code of Practice publishers are required to deliver at least the level 1 data referring to full-text requested and turn-aways. For publishers and vendors, these are the indicators of primary concern.
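As an illustration of what a level 1 report contains, the sketch below totals full-text article requests by journal and by month from a list of request records. The record layout, the function name `full_text_requests_by_journal_month` and the sample values are assumptions made for illustration only; they are neither the COUNTER report format itself nor CERN data.

```python
from collections import defaultdict


def full_text_requests_by_journal_month(records):
    """Total full-text requests per (journal, month).

    `records` is an iterable of (journal, month, requests) tuples. This
    layout is a hypothetical simplification, not the COUNTER file format.
    """
    totals = defaultdict(int)
    for journal, month, requests in records:
        totals[(journal, month)] += requests
    return dict(totals)


# Hypothetical sample values, for illustration only.
sample = [
    ("Journal A", "2003-01", 12),
    ("Journal A", "2003-01", 3),
    ("Journal A", "2003-02", 7),
    ("Journal B", "2003-01", 5),
]
report = full_text_requests_by_journal_month(sample)
```

Keeping one total per journal and per month is what makes reports from different publishers comparable at all; any finer indicator (abstracts, sessions, searches) would need additional columns that level 1 does not require.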
Publishers started to deliver usage statistics reports including data corresponding to the indicators mentioned by ICOLC, for example: requests for tables of contents, abstracts, sessions, article downloads, searches and turn-aways. The multiplicity of data offered was the first difficulty for the comparison of the reports. Later, publishers moved to the COUNTER Code of Practice preferring to offer level 1 reports (reports 1 and 2 and sometimes report 3). By following the COUNTER Code of Practice, some publishers could simplify their statistical reporting but at the same time, unfortunately for libraries, reduce the number of indicators. Most publishers deliver reports on a monthly basis, but their interfaces and report formats vary from one to another. Table 1 summarizes the indicators offered by some publishers for each journal title. We note that all publishers offer the indicator of total full-text requested. These data give us the possibility of answering questions such as: is there a relationship between journal usage and subscription price; what is the relationship between impact factor, price and usage? But the data concerning more specifically the use of the publishers’ interfaces and services, such as the number of sessions, the entry pages and exit pages, are not given for any journal title.
Table 1: Data availability by journal and publishers compliant with the COUNTER guidelines
Notes:
i) Searches by database and for ScienceDirect
ii) Not available for CERN
iii) For all collections
iv) PDF / HTML
v) By journal
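One of the questions raised above, the relationship between journal usage and subscription price, can be approached with the one indicator that every publisher provides: total full-text requests. The sketch below computes a cost per full-text request for each title. The function name `cost_per_request`, the journal names, the prices and the request counts are all hypothetical; the article does not give these figures.

```python
def cost_per_request(subscription_price, full_text_requests):
    """Return price divided by requests, or None when the title was not used."""
    if full_text_requests == 0:
        return None
    return subscription_price / full_text_requests


# Hypothetical prices and request counts, for illustration only.
journals = {
    "Journal A": (1200.0, 600),
    "Journal B": (900.0, 30),
    "Journal C": (500.0, 0),
}
costs = {
    title: cost_per_request(price, requests)
    for title, (price, requests) in journals.items()
}
```

Returning `None` for an unused title, rather than an infinite or zero cost, keeps such titles visible as a separate case when cancellation decisions are discussed.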
The following problems came up when studying our local usage statistics:
In order to understand the use of the CERN e-journals collection, we use three different approaches to the statistics:
a) Journal elements usage: comparison of the use of four journal elements for a sample group of subscribed journals — table of contents (TOC), abstracts, search and full-text articles requested.
b) Journal usage comparison using a single element: articles (full-text requested or downloaded)
i. Comparing journal usage by publisher
ii. Comparing journal usage from a single publisher over a 12-month period
c) Article usage within a single journal:
i. Comparing article usage and CERN publishing in one journal
ii. Comparing article usage, total articles and CERN articles in the core HEP journal collection
When we started the study we were trying to identify and to analyze the use of e-journals. We compared four indicators: table of contents, abstracts, search and full-text articles requested. It is important to know that in the CERN e-journals catalogue, there are direct links to these elements.
For our study we have analyzed 12 months of statistics for the AIP (American Institute of Physics) journals in the CERN collection. We chose those journals because AIP offers all four indicators.
Figure 1: Journal elements usage: AIP
Figure 1 shows that:
This information is interesting because it shows the users’ behaviour.
It could be useful for the design of user interfaces or to select new e-services
mentioned in the catalogue but not subscribed — usage of tables of contents
and abstracts of unsubscribed titles is used by some libraries for justifying
new purchases [4]. However, this approach is limited; it shows how an e-journal is used but
it does not answer the question of what is used within the journal and when.
As we can see in Table 1, the number of full-text articles requested is the only indicator offered by all publishers, so we decided to follow our research using this data.
Comparing journal usage by publisher
Our idea was to compare the e-journals collection by publisher over a determined period of time, in order to have a global view of this usage.
Figure 2: CERN e-journals collection — 12 months usage
Comparing journal usage by publisher allows us to compare one year’s activity with the previous year’s activity. This figure is also useful for license negotiations by a library or a consortium. These negotiations benefit from having information on how much a publisher’s collection is used or not. In the case of consortia it is difficult to try to compare usage across different libraries because each library collection and its use is determined by the library’s context (needs, resources and users).
For CERN, this approach is limited because publishers produce different numbers of relevant titles for our scientific community. We subscribe to more titles (and thus have more usage) from one specific publisher than from another. This result is also influenced by access through the consortium to titles we might not otherwise be able to afford and by the level of digitization of back files offered by the publishers.
Comparing journal usage from one publisher over a 12-month period
As noted above, comparing journal usage by publisher shows roughly what is used in the CERN e-journals collection, but our knowledge about which journals are used is still vague. To investigate further we studied journal usage from one publisher over a 12-month period. Figure 3 shows Kluwer (now Springer) journal titles accessed by consortium license, some of which are also subscribed to in print. There are again obvious limitations to this approach, as different journals cannot be compared directly: we cannot compare a specialized quarterly with a weekly general subject journal.
However, from figure 3 we note:
Figure 3: full-text downloads of Kluwer (now Springer) journal titles
1) The case of Space Science Reviews
CERN Library has never subscribed to Space Science Reviews but CERN
users can access the online version through the Swiss Consortium of Academic
and Research Libraries license. The large amount of full-text use registered
in a short time period (figure 4) raises interest.
This kind of use at CERN could correspond to the activity of a particular
experiment taking place in the Laboratory. In order to investigate this hypothesis
we would need information about the IP addresses that have accessed and downloaded
the articles, but unfortunately this information is not given by the publishers
because of the protection of privacy. This concern over privacy represents
an obstacle for knowing more about the needs and interests of the research
carried out locally. In any case, at CERN not all IP addresses are linked
to one specific user; many scientists visit CERN for short periods and the
IP addresses are linked to departments, units and collaborations.
Figure 4: full-text downloads of Space Science Reviews
2) No use of Space Science Reviews between September and
December 2002
As previously hypothesized, the use of this title may reflect the interest
of a particular group of scientists not always physically present on the
site, or occasional research on a subject tangential to high energy physics, such as space
science. Or it could reflect the publication of one or more
high energy physics articles in this journal.
3) Regular use of a subscribed title: Hyperfine Interactions.
Figure 5 shows the usage of titles also subscribed to in print.
Hyperfine Interactions has been held in print in the Library for many years and online access is now given through the consortium license. We clearly see a regular and high use of Hyperfine Interactions. A more detailed analysis at the article level might lead us to find an explanation of how this journal is used.
Figure 5: full-text downloads of Kluwer titles subscribed to in print
Comparing article usage and CERN publishing in one journal
In this part of our study, we connect the figures concerning full-text downloads
from Hyperfine Interactions to the total number of articles published by
CERN authors over the same period of time in this journal.
Hyperfine Interactions is a journal devoted to research in the border regions
of solid-state physics, atomic physics and nuclear physics and relevant areas
of chemistry. CERN scientists from one particular experiment (ISOLDE) have
published articles in this journal.
Kluwer offered statistics on the most-used articles each month. The articles were presented by URL. We selected all the URLs representing articles in Hyperfine Interactions and then we compared the requests for full-text of articles by CERN authors with the requests for full-text of articles by non-CERN authors. The data relating to the author affiliation was locally generated.
Figure 6: Full-text requests from Hyperfine Interactions by each article’s author affiliation
Comparing article usage and CERN publishing in a single journal gives us some interesting information on the deeper use of a journal. In figure 6, we note that CERN’s users download articles written by both CERN-affiliated and non-CERN-affiliated authors. However, as figure 7 shows, articles published in 2002-2003 by CERN authors represent only 4% of the total articles published in this journal but account for 40% of the total downloads.
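The comparison between the two shares can be sketched as follows. The counts below are hypothetical and were chosen only so that they reproduce the 4% and 40% shares reported in the text; the underlying numbers of articles and downloads are not given in the article, and the helper `share` is an illustrative name.

```python
def share(part, whole):
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Hypothetical counts, chosen only to reproduce the shares quoted in the text.
total_articles = 500
cern_articles = 20
total_downloads = 1000
cern_downloads = 400

article_share = share(cern_articles, total_articles)      # share of published articles
download_share = share(cern_downloads, total_downloads)   # share of local downloads
over_representation = download_share / article_share      # how much more CERN papers are used
```

With these shares, articles by CERN authors are downloaded at CERN roughly ten times more often than their proportion of the journal's content would suggest.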
In the context of the open access debate and the new business models proposed by some publishers concerning an author or institution publication fee, we can ask how much the publication of those articles would cost for CERN and then compare this figure with the current subscription cost. The open access article fee of the Springer Open Choice model [19] is 3000 US dollars. An easy calculation shows us a very high cost for the sum of these articles compared with the subscription cost. We wonder how institutions will be able to handle and organize this payment.
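The calculation referred to above can be sketched in a few lines. Only the fee of 3000 US dollars per article comes from the text [19]; the number of CERN articles and the subscription cost used below are hypothetical placeholders, since the article does not state them, and `author_pays_total` is an illustrative name.

```python
OPEN_CHOICE_FEE_USD = 3000  # per-article fee quoted in the text [19]


def author_pays_total(n_articles, fee_per_article=OPEN_CHOICE_FEE_USD):
    """Total publication fees for n_articles under a per-article fee model."""
    return n_articles * fee_per_article


# Hypothetical inputs, for illustration only (not CERN figures).
n_cern_articles = 20
subscription_cost_usd = 5000

total_fee = author_pays_total(n_cern_articles)
ratio = total_fee / subscription_cost_usd  # fees expressed in multiples of the subscription
```

Replacing the placeholders with the real number of articles by local authors and the real subscription price gives the comparison the paragraph describes.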
Comparing article usage, total articles and CERN articles in the core HEP journal collection
Following our previous study we decided to use a similar approach on the core high energy physics titles, the main subject area for CERN. We compared (by journal and by 12-month period) the total number of full-texts requested with the total number of published articles in the journal and also the total number of published articles by CERN authors. The latter two figures are not given by the publishers so this data had to be generated locally.
Figure 8: Comparison of full-text downloads, total published articles and
articles published by CERN authors in core high energy physics titles
Figure 8 raises more questions. For certain journals, the total number of full-text downloads exceeds not only the number of articles by CERN authors, but also the total number of published articles. And the question behind the present study as to what is really used remains crucial and unanswered. We can formulate some hypotheses such as:
Unfortunately, we do not have the data that would confirm these hypotheses.
To answer these questions we would need more article-oriented data. Unfortunately
only one publisher, Kluwer, offered this information and after the merger with
Springer (a COUNTER level 1 compliant publisher), we fear that it will no longer
be provided:
“Will the user statistics be merged and put on one platform? If so, when? As part of the migration of all content to a single platform, we will also merge all of our usage reports to a common format. We also expect this to be completed by the end of 2004” [20].
- Statistics for e-journal usage are useful but the publishers’ reports are often insufficient for full analysis. Our approach would lead to different interpretations if applied to other libraries and contexts. Publishers’ reports do not satisfy the needs behind the present open access debate. New information products with costs based on the numbers of articles downloaded (IEEE Enterprise [21] and SCOPUS Elsevier [22]) reinforce an article-oriented approach more than a journal title approach. The current reports give only global information on titles and do not give value to articles.
- We would like publishers’ statistical reports to be improved at the detailed, article level and to provide useful information for management decisions, such as:
- How many times a specific article has been downloaded.
- Users’ actions regarding printing, sending and/or reading articles.
- IP addresses from which journals and articles are accessed.
[1] Eliane Chaney, Catherine Bulliard, Caroline Christiansen and Jean-Pierre Cressent. “Une Bibliothèque de Recherche Face à l'Édition Électronique” Bulletin des bibliothèques de France, 44:2 (1999): 27-32.
[2] David Peter Dallman. “Electronic Journals and
Electronic Publishing at CERN: a Case Study” in International
Spring School on the Digital Library and E-publishing for Science and Technology,
CERN, Geneva, Switzerland, 3 - 8 Mar 2002, p. 4.1-4.10.
Preprint available
URL: <http://cdsweb.cern.ch/search.py?recid=597899&ln=en>
[3] Linda S. Mercer. “Measuring the Use and Value of Electronic Journals and Books”, Science and Technology Librarianship (Winter 2000).
URL: <http://www.library.ucsb.edu/istl/00-winter/article1.html> (visited December 2, 2004).
[4] Simon Bevan and Louise Jones. “Using COUNTER Statistics: a Practical Perspective”, in: "Reports on briefing sessions and workshops held at the 27th UKSG Annual conference", Serials, 17:2 (July 2004): 169.
[5] International Coalition of Library Consortia (ICOLC). Guidelines for Statistical Measures of Usage of Web-Based Indexed, Abstracted, and Full Text Resources (November 1998).
URL: <http://www.library.yale.edu/consortia/webstats.html> (visited December 2, 2004)
[6] COUNTER: Counting Online Usage of Networked Electronic
Resources. The COUNTER Code of Practice. Release 1: December 2002.
URL: http://www.projectcounter.org/code_practice.html#start
(visited December 2, 2004)
[7] COUNTER: Counting Online Usage of Networked Electronic
Resources. The COUNTER Code of Practice — Journals and Databases. Release
2, April 2005 (valid from 1 January 2006). (Draft published on the COUNTER
website in April 2004).
URL: <http://www.projectcounter.org/cop2.html> (visited December 2, 2004)
[8] Association of Research Libraries. E-metrics: Measures for Electronic Resources.
URL: <http://www.arl.org/stats/newmeas/emetrics/> (visited December 2, 2004)
[9] National Information Standards Organization (NISO). NISO Z39.7-2004
Draft Standard for Trial Use American National Standard for Information Services
and Use: Metrics & Statistics for Libraries and Information Providers — Data
Dictionary. Approved: 6th October 2004.
URL:<http://www.niso.org/emetrics/universal_preview.cfm?id=1301&versionid=2>
[10] International Organization for Standardization (ISO), 2003. ISO/CD 2789 Information and Documentation: International Library Statistics.
[11] International Organization for Standardization (ISO), 2003. ISO/CD 11620/AMD.1 Information and Documentation: Library Performance Indicators.
[12] International Organization for Standardization (ISO), 2003. Technical Report 20983: Information and Documentation - Performance Indicators for Electronic Library Services.
[13] EQUINOX. Library Performance Measurement and
Quality Management System — 27th
November 1998 - 26th November 2000 (Project funded under the Telematics for
Libraries Programme of the European Commission).
URL:<http://equinox.dcu.ie/>
[14] JISC/Publishers' Association (PALS) Working Group
and Joint Working Parties. PALS Working Group. Home page
URL:<http://www.jisc.ac.uk/index.cfm?name=wg_pals_home>
[15] National Commission on Libraries & Information
Science (NCLIS). NCLIS Library Statistics Program (LSP) Home page
URL:<http://www.nclis.gov/statsurv/statist.html>
[16] CERN Document Server (CDS). Homepage
URL:<http://cdsweb.cern.ch/>
[17] Carmen O'Dell, David Peter Dallman, J. Vigen and Martin Vesely. “50 Years of Experience in Making Grey Literature Available: Matching the Expectations of the Particle Physics Community” Publishing Research Quarterly, 20:1 (2004), p. 84-91.
Also in: 5th International Conference on Grey Literature: Grey Matters in the World of Networked Information (GL 2003). Amsterdam, The Netherlands, 4 - 5 Dec 2003: p. 117-123.
Also available as preprint: CERN-OPEN-2003-053. URL: <http://cdsweb.cern.ch/search.py?recid=690102&ln=en>
[18] J. Vigen. “Why Go Via Expensive Solutions, When You Can GoDirect?: a Description of How CERN Library Integrates Digital Content”. In: The Digital Library and e-Publishing for Science, Technology and Medicine: practical course information, CERN, Geneva, Switzerland, 13 - 18 Jun 2004.
Also available as preprint: CERN-OPEN-2004-018, 10p. URL: <http://cdsweb.cern.ch/search.py?recid=780090&ln=en>
[19] Springer Open Choice.
URL:<http://www.springeronline.com/sgw/cda/frontpage/0,11855,1-40359-0-0-0,00.html>
[20] Springer. In: Customer FAQ for the Springer-Kluwer
merger, Effective August 1, 2004, Springer.
URL:<http://docrecherche.ca/new/springer/Springer-FAQ-7-04.pdf>
[21] IEEE Enterprise. Home page
URL: <http://www.ieee.org/portal/site/discover/menuitem....>
[22] Scopus. Homepage.
URL:<http://www.scopus.com/scopus/home.url>
Tony Kidd, “Electronic Journal Usage Statistics: Present Practice and
Future Progress” In: Statistics in practice – measuring & managing,
(Papers from an international array of specialist contributors to the IFLA
Satellite Conference, held at Loughborough University in August 2002), LISU
Occasional Paper No. 32, May 2003. p.67-72,
URL:<http://www.lboro.ac.uk/departments/dils/lisu/downloads/statsinpractice-pdfs/kidd.pdf>
Judy Luther. White Paper on Electronic Journal Usage. (Council on Library and Information Resources, 2000)
URL: <http://www.clir.org/pubs/reports/pub94/contents.html>
Judy Luther. “Getting statistics we can use”. In: Meaningful
measures for emerging realities. Proceedings of the 4th Northumbria International
Conference on Performance Measurement in Libraries and Information Services,
Pittsburgh, PA, August 12-16, 2001, ed. Joan Stein, Martha Kyrillidou and
Denise Davis (Washington, DC: Association of Research Libraries, 2002), p.321-330.
URL:<http://www.libqual.org/documents/admin/luther.pdf>
Magaly Báscones Dominguez studied law and information sciences. This article is based on her activities whilst working at the Periodicals Unit of the CERN Library. At present, she works as an Associated Information Officer at the United Nations High Commissioner for Refugees (UNHCR) Library and Visitors’ Centre.