High Energy Physics Libraries Webzine


 HEP Libraries Webzine
Issue 11 / August 2005

Applying Usage Statistics to the CERN E-journals Collection: a Step Forward

Magaly Báscones Dominguez (*)


The terms regulating usage statistics from publishers establish that the data are only for institutional use and are not meant to be shared with other institutions. For this reason the figures in this paper are intended only to illustrate the results.


Despite the existence of ICOLC and COUNTER guidelines recommending to publishers the statistics that would be useful for examining e-journal usage, there is little consistency in the data produced. At CERN we have looked at the data offered and have analyzed it using some new techniques. Use of different elements (abstracts, tables of contents, search pages and the articles themselves) shows some interesting differences between titles. Counting the total full-text downloads from individual titles has identified unsubscribed titles (accessed via a consortial agreement) with high use but without more detailed data we cannot know who is using them or how the articles are being used and therefore whether or not we should subscribe to the title in future. However, studying in detail the high use of one subscribed title has shown that there is heavy download of articles by CERN authors. The data is limited in its usefulness in understanding the use of e-journals in general. More detailed data at an article level and by IP address is required. A closer collaboration between librarians and publishers could identify this data and make possible interpretations of the e-journal use in context in order to gain a real understanding.


New information technologies have had many influences on the information world. E-resources (electronic journals, databases, web resources and e-books) are increasingly present in libraries, offering users alternative access points to information. In this digital environment, licensed electronic information is increasingly the dominant source of content. As a result, librarians have lost direct control over their collections and must rely on aggregators and publishers for the supply of collection usage measurements.

E-journals have been present in the CERN Library catalogue since 1998. Today, the collection of e-journals (including online-only and those having a print counterpart) comprises about 1517 titles (Source CDS). These resources have been catalogued, displayed and promoted and an acquisition and management policy has been implemented [1]. The existence of e-journals in the CERN Library during the last ten years calls for some reflection on their impact and potential.

E-journals are generally just as expensive as printed journals and the number of titles available electronically and the add-on services continually increase. For research institutions and their libraries, there is not only the question of balancing a limited overall budget but also of how the research budgets can best be used. In fact, research institutions are confronted with having to pay for both the research itself and for the published results through journal subscriptions. As Dallman explains:

“Concerning journal subscriptions, we find the situation for some particle physics core journals (such as Physics Letters B) rather bizarre. The community pays large sums of money to build the accelerators and carry out the experiments, and the editors and referees are also researchers in the field who carry out these tasks in their research time. But then we are asked to pay more and more to buy back the results we have produced, without which there would be no journal to publish in the first place” [2]

In view of the problems of those institutions which effectively have to pay twice, and of the possible solution that alternative (open access) electronic publishing models might offer, we need some clarity about the situation in order to take future decisions about publishing scientific articles. These decisions should be informed by knowledge of the use of e-journals. Several studies on the integration of e-resources into library environments and the creation of electronic libraries can be found, but at present the use of e-journals still poses more questions than answers, particularly concerning preservation, distribution and commercialization. In addition, we do not really know how to measure usage effectively.

Electronic technology provides an opportunity for improving journal usage measurement, at least for e-journals. Measurements from the vendor's web site are possible, and librarians and publishers could profit from a better understanding of the use of electronic information to feed into their management decisions. Indicators could be used in libraries for cancellation and purchase decisions, for learning more about users' habits (how they use resources, identification of areas where training might be appropriate, file format preferences, e.g. PDF or HTML), and for evaluation of performance [3]. Publishers use statistical data to aid their editorial decision-making and to see the usage via routes/interfaces [4].

Usage statistics for e-journals are sent from most publishers to purchasing libraries or consortia. The formats and indicators of these statistical reports are based on the ICOLC (International Coalition of Library Consortia) guidelines [5] or on COUNTER (Counting Online Usage of NeTworked Electronic Resources) recommendations [6]. ICOLC’s Guidelines for statistical measures of usage of web based information, published in 1998, define the indicators that should be provided and invite a commitment from information providers to supply reports to libraries. They describe the need for the library community to measure usage of the new resources. In response to this appeal, information suppliers started to provide usage statistics to libraries.

In 2002, the COUNTER Code of Practice was launched, developed by publishers and librarians. This Code of Practice is an attempt to standardize usage statistics reports according to different levels of accomplishment. For achieving level 1, publishers have to provide reports that include the number of full-text articles requested by journal and by month as the main indicator. The choice of this indicator over others is based on the idea that journal users are mainly interested in reading articles. The draft of Release 2 of the COUNTER Code of Practice [7] continues to focus on the measurement of full-text requests but adds to level 1 a differentiation between full-text versions in PDF and HTML formats.

Publishers started to deliver usage statistics reports including data corresponding to the indicators mentioned by ICOLC, for example tables of contents requests, abstract requests, total sessions, article downloads, turn-aways, searches, etc. and libraries quickly became aware of the difficulty of comparing data from different vendors.

However, the lack of standardization in the data offered and in the metric definitions is the main difficulty for comparison and analysis. The result is that today, some publishers only achieve COUNTER Code of Practice level 1, standardizing their reports but reducing their indicators to only two.

Statistics are therefore still not consistent. The absence of tools to analyze and compare the data is another gap. We do not think that this situation should stop analysis, but that finding tangible examples to study is a way to go forward.

We would like to contribute to the study of usage statistics of e-journals and so have looked at some new methods for doing so. Our main goal is to understand the use of the CERN e-journals collection. In order to have an accurate analysis we will take into consideration:

The article is divided into three parts. The first section presents the context: CERN and its Library e-journals collection. The second section presents the ICOLC, COUNTER and publishers’ reports. The third section presents usage statistics and the different approaches for comparing and analyzing data.

We will not explore previous e-metrics research in this paper. There are various studies and projects on this subject, for example, the e-metrics project of the Association of Research Libraries (ARL) [8] which began in 2000 and other efforts in e-metrics (e.g. [9], [10], [11], [12], [13], [14], [15]). However, what is clear is that there is no certitude about what is measured by the action of a click, a browse or a print. Without more precision on this point it is difficult to draw conclusions.

CERN Context

CERN is the European Laboratory for Particle Physics located in Geneva. It provides scientific research facilities to some 3000 permanent staff and 6500 visiting scientists. It is the world's largest particle physics centre, used by about half of the world's particle physicists. The users represent 500 universities from over 80 countries. The CERN staff design and build CERN's intricate machinery and ensure its smooth operation. They help prepare, run, analyze and interpret the complex scientific experiments and carry out the variety of tasks required to make such a special organization successful. Constructing these highly advanced machines is an extremely costly operation. CERN publication policy states that all results from CERN experiments and research are made public. Most of the scientific papers written by CERN collaborators are available through the library or the CDS (CERN Document Server) [16].

The CERN Library offers an e-collection of preprints, published articles, theses, annual reports, conference proceedings, databases, Internet resources, e-journals and e-books. The CERN Document Server (CDS) hosts e-versions of full-text article preprints and links preprint records with the published version if a CERN subscription to access that journal exists. The acquisition policy is elaborated by the Working Group for Acquisitions and the Scientific Information Policy Board.

The CERN Scientific Information Service has been active in the field of digital library research and in providing scientific information services to the high-energy physics community for almost five decades now. The research focus has been on interoperability issues for document storage and retrieval systems, metadata added-value services, digital library automation and networked information services [17], [18].

The periodicals catalogue contains separate records for printed journals and for electronic versions with more than twice as many of the latter as of the former. The e-journals are either online versions of printed titles or electronic-only journals (listed at http://cdsweb.cern.ch/?c=Periodicals&as=0&ln=en and http://library.cern.ch/electronic_journals/ej.html respectively). The subjects covered are high energy physics and related subjects such as astronomy, mathematics, physics in general, computer science, electronics, etc. Online journal subscriptions are obtained through site or consortium licenses. Since 2002 the CERN Library has been a member of the Swiss Consortium of Academic and Research Libraries.

ICOLC, COUNTER and publishers’ reports

The ICOLC ‘Guidelines for Statistical Measures of Usage of Web-Based Indexed, Abstracted, and Full Text Resources’ [5] ask for six types of information to be produced on a regular basis:

The COUNTER Code of Practice for journals [6], [7] establishes recommendations for two levels of accomplishment and for the delivery of four reports containing:

Under the COUNTER Code of Practice publishers are required to deliver at least the level 1 data referring to full-text requested and turn-aways. For publishers and vendors, these are the indicators of primary concern.

Publishers’ reports

Publishers started to deliver usage statistics reports including data corresponding to the indicators mentioned by ICOLC, for example: requests for tables of contents, abstracts, sessions, article downloads, searches and turn-aways. The multiplicity of data offered was the first difficulty for the comparison of the reports. Later, publishers moved to the COUNTER Code of Practice preferring to offer level 1 reports (reports 1 and 2 and sometimes report 3). By following the COUNTER Code of Practice, some publishers could simplify their statistical reporting but at the same time, unfortunately for libraries, reduce the number of indicators. Most publishers deliver reports on a monthly basis, but their interfaces and report formats vary from one to another. Table 1 summarizes the indicators offered by some publishers for each journal title. We note that all publishers offer the indicator of total full-text requested. These data give us the possibility of answering questions such as: is there a relationship between journal usage and subscription price; what is the relationship between impact factor, price and usage? But the data concerning more specifically the use of the publishers’ interfaces and services, such as the number of sessions, the entry pages and exit pages, are not given for any journal title.

Table 1: Data availability by journal and publishers compliant with the COUNTER guidelines


i) Searches by database and for ScienceDirect
ii) Not available for CERN
iii) For all collections
iv) PDF / HTML
v) By journal

Comparing and analyzing the data: usefulness and limits

The following problems came up when studying our local usage statistics:

In order to understand the use of the CERN e-journals collection, we use three different approaches to the statistics:

a) Journal elements usage: comparison of the use of four journal elements for a sample group of subscribed journals — table of contents (TOC), abstracts, search and full-text articles requested.

b) Journal usage comparison using a single element: articles (full-text requested or downloaded)

i. Comparing journal usage by publisher
ii. Comparing journal usage from a single publisher over a 12-month period

c) Article usage within a single journal:

i. Comparing article usage and CERN publishing in one journal
ii. Comparing article usage, total articles and CERN articles in the core HEP journal collection

a) Journal elements usage

When we started the study we were trying to identify and to analyze the use of e-journals. We compared four indicators: table of contents, abstracts, search and full-text articles requested. It is important to know that in the CERN e-journals catalogue, there are direct links to these elements.

For our study we have analyzed 12 months of statistics for the AIP (American Institute of Physics) journals in the CERN collection. We chose those journals because AIP offers all four indicators.

Figure 1: Journal elements usage: AIP

Figure 1 shows that:

This information is interesting because it shows the users’ behaviour. It could be useful for the design of user interfaces or for selecting new e-services that are mentioned in the catalogue but not subscribed; some libraries use requests for tables of contents and abstracts of unsubscribed titles to justify new purchases [4]. However, this approach is limited: it shows how an e-journal is used but it does not answer the question of what is used within the journal and when.

b) Journal usage comparison using a single element: full-text requested or downloaded

As we can see in Table 1, the number of full-text articles requested is the only indicator offered by all publishers, so we decided to continue our research using this indicator.

Comparing journal usage by publisher

Our idea was to compare the e-journals collection by publisher over a determined period of time, in order to have a global view of this usage.

Figure 2: CERN e-journals collection — 12 months usage

Comparing journal usage by publisher allows us to compare one year’s activity with the previous year’s activity. This figure is also useful for license negotiations by a library or a consortium. These negotiations benefit from having information on how much a publisher’s collection is used or not. In the case of consortia it is difficult to try to compare usage across different libraries because each library collection and its use is determined by the library’s context (needs, resources and users).

For CERN, this approach is limited because publishers produce different numbers of relevant titles for our scientific community. We subscribe to more titles (and thus have more usage) from one specific publisher than from another. This result is also influenced by access through the consortium to titles we might not otherwise be able to afford and by the level of digitalization of back files offered by the publishers.

Comparing journal usage from one publisher over a 12-month period

As noted above, comparing journal usage by publisher shows roughly what is used in the CERN e-journals collection, but our knowledge about which journals are used is still vague. To investigate further we studied journal usage from one publisher over a 12-month period. Figure 3 shows Kluwer (now Springer) journal titles accessed by consortium license, some of which are also subscribed to in print. There are again obvious limitations to this approach, as different kinds of journal cannot be compared directly: a specialized quarterly cannot be compared with a weekly general-subject journal.

However, from figure 3 we note:

  1. The use of the non-subscribed title (i.e. Space Science Reviews), never previously considered as being of interest to our users.
  2. No use of that same title between September and December.
  3. Regular use of a subscribed title: Hyperfine interactions.
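The kind of reading made from figure 3 can be sketched in a few lines of Python. This is only an illustration, assuming that a publisher's monthly report has been keyed in as twelve full-text download counts per title; the titles, the numbers and the function name below are invented for the example and are not CERN data.

```python
from typing import Dict, List

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def summarize_monthly_use(report: Dict[str, List[int]]) -> Dict[str, dict]:
    """Return total downloads and the months without any use, per title."""
    summary = {}
    for title, monthly in report.items():
        if len(monthly) != len(MONTHS):
            raise ValueError(f"{title}: expected 12 monthly counts")
        summary[title] = {
            "total": sum(monthly),
            # Months in which no full-text download was recorded.
            "zero_months": [m for m, n in zip(MONTHS, monthly) if n == 0],
        }
    return summary


# Invented figures, for illustration only.
example_report = {
    "Title A (consortium access only)": [40, 55, 62, 48, 30, 25, 12, 5, 0, 0, 0, 0],
    "Title B (also held in print)": [20, 22, 19, 25, 21, 18, 15, 17, 23, 24, 20, 19],
}

if __name__ == "__main__":
    for title, stats in summarize_monthly_use(example_report).items():
        print(title, stats["total"], stats["zero_months"])
```

A summary of this kind only flags the pattern (a burst of use followed by silence, or steady use); explaining it still requires the contextual knowledge discussed in the cases that follow.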

Figure 3: full-text downloads of Kluwer (now Springer) journal titles

1) The case of Space Science Reviews
CERN Library has never subscribed to Space Science Reviews but CERN users can access the online version through the Swiss Consortium of Academic and Research Libraries license. The large amount of full-text use registered in a short time period (figure 4) raises interest. This kind of use at CERN could correspond to the activity of a particular experiment taking place in the Laboratory. In order to investigate this hypothesis we would need information about the IP addresses that have accessed and downloaded the articles, but unfortunately this information is not given by the publishers because of the protection of privacy. This concern over privacy represents an obstacle for knowing more about the needs and interests of the research carried out locally. In any case, at CERN not all IP addresses are linked to one specific user; many scientists visit CERN for short periods and the IP addresses are linked to departments, units and collaborations.

Figure 4: full-text downloads of Space Science Reviews

2) No use of Space Science Reviews between September and December 2002
As previously hypothesized, the use of this title may reflect the interest of a particular group of scientists who are not always physically present on the site, or occasional research on a subject tangential to high energy physics, such as space science. Or it could reflect the publication of one or more high energy physics articles in this journal.

3) Regular use of a subscribed title: Hyperfine Interactions.
Figure 5 shows the usage of titles also subscribed to in print.

Hyperfine Interactions has been held in print in the Library for many years and online access is now given through the consortium license. We clearly see a regular and high use of Hyperfine Interactions. A more detailed analysis at the article level might lead us to find an explanation of how this journal is used.

Figure 5: full-text downloads of Kluwer titles subscribed to in print

c) Article usage within a single journal:

Comparing article usage and CERN publishing in one journal
In this part of our study, we connect the figures concerning full-text downloads from Hyperfine Interactions to the total number of articles published by CERN authors over the same period of time in this journal.

Hyperfine Interactions is a journal devoted to research in the border regions of solid-state physics, atomic physics and nuclear physics and relevant areas of chemistry. CERN scientists from one particular experiment (ISOLDE) have published articles in this journal.

Kluwer offered statistics on the most-used articles each month. The articles were presented by URL. We selected all the URLs representing articles in Hyperfine Interactions and then we compared the requests for full-text of articles by CERN authors with the requests for full-text of articles by non-CERN authors. The data relating to the author affiliation was locally generated.
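The comparison described above can be sketched as follows. This is a minimal illustration and not the procedure actually used at CERN: it assumes the monthly most-used list is available as (URL, request count) pairs, that articles of one journal can be recognised by a fragment of the URL, and that the locally generated affiliation data is simply a set of URLs of articles by CERN authors. All URLs, counts and names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def requests_by_affiliation(
    most_used: Iterable[Tuple[str, int]],
    journal_marker: str,
    cern_urls: Set[str],
) -> Dict[str, int]:
    """Sum full-text requests for one journal, split by author affiliation."""
    totals = {"CERN": 0, "non-CERN": 0}
    for url, requests in most_used:
        if journal_marker not in url:
            continue  # article belongs to another journal
        group = "CERN" if url in cern_urls else "non-CERN"
        totals[group] += requests
    return totals


def share(totals: Dict[str, int], group: str) -> float:
    """Fraction of all counted requests that falls in the given group."""
    grand_total = sum(totals.values())
    return totals[group] / grand_total if grand_total else 0.0


# Hypothetical most-used list and locally compiled set of CERN-authored articles.
most_used = [
    ("https://publisher.example/hyperfine/art-001", 12),
    ("https://publisher.example/hyperfine/art-002", 8),
    ("https://publisher.example/hyperfine/art-003", 20),
    ("https://publisher.example/other-journal/art-009", 30),
]
cern_urls = {
    "https://publisher.example/hyperfine/art-001",
    "https://publisher.example/hyperfine/art-002",
}

if __name__ == "__main__":
    totals = requests_by_affiliation(most_used, "/hyperfine/", cern_urls)
    print(totals, share(totals, "CERN"))
```

The affiliation set is the costly part: as noted above, it has to be generated locally, because publishers do not supply it.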

Figure 6: Full-text requests from Hyperfine Interactions by each article’s author affiliation

Comparing article usage and CERN publishing in a single journal gives us some interesting information on the deeper use of a journal. In figure 6, we note that CERN’s users download articles written by both CERN-affiliated and non-CERN-affiliated authors. However, as figure 7 shows, articles published in 2002-2003 by CERN authors represent only 4% of the total articles published in this journal but account for 40% of the total downloads.

Figure 7: Articles in Hyperfine Interactions written by CERN authors in 2002-3 as a proportion of total articles and articles downloaded in the same period by author affiliation.


In the context of the open access debate and the new business models proposed by some publishers concerning an author or institution publication fee, we can ask how much the publication of those articles would cost for CERN and then compare this figure with the current subscription cost. The open access article fee of the Springer Open Choice model [19] is 3000 US dollars. An easy calculation shows us a very high cost for the sum of these articles compared with the subscription cost. We wonder how institutions will be able to handle and organize this payment.
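The calculation referred to here is a simple multiplication, sketched below. Only the fee of 3000 US dollars per article comes from the text; the number of CERN-authored articles and the subscription price used in the example are invented placeholders, since the real figures are not given in this paper.

```python
OPEN_CHOICE_FEE_USD = 3000  # per-article fee quoted in the text [19]


def publication_fee_total(n_articles: int, fee_usd: int = OPEN_CHOICE_FEE_USD) -> int:
    """Total author-side fee for publishing n_articles under a per-article model."""
    return n_articles * fee_usd


def fee_to_subscription_ratio(
    n_articles: int,
    subscription_usd: float,
    fee_usd: int = OPEN_CHOICE_FEE_USD,
) -> float:
    """How many times the subscription price the publication fees would be."""
    if subscription_usd <= 0:
        raise ValueError("subscription price must be positive")
    return publication_fee_total(n_articles, fee_usd) / subscription_usd


if __name__ == "__main__":
    # Hypothetical inputs: 20 CERN-authored articles, 5000 USD subscription.
    print(publication_fee_total(20))            # 60000
    print(fee_to_subscription_ratio(20, 5000))  # 12.0
```

With these placeholder inputs the fees come to twelve times the subscription price; the real ratio depends entirely on the actual article count and subscription cost.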

Comparing article usage, total articles and CERN articles in the core HEP journal collection
Following our previous study we decided to use a similar approach on the core high energy physics titles, the main subject area for CERN. We compared (by journal and by 12-month period) the total number of full-texts requested with the total number of published articles in the journal and also the total number of published articles by CERN authors. The latter two figures are not given by the publishers so this data had to be generated locally.
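A comparison of this kind can be sketched as below, assuming one record per journal and 12-month period. The record layout, the journal names and all numbers are hypothetical; they only illustrate how the three figures relate to one another.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JournalYear:
    title: str
    downloads: int           # full-text requests, from the publisher's report
    articles_published: int  # generated locally
    cern_articles: int       # generated locally

    def downloads_per_article(self) -> float:
        """Average number of full-text requests per published article."""
        if self.articles_published == 0:
            return 0.0
        return self.downloads / self.articles_published

    def cern_share(self) -> float:
        """Fraction of the published articles written by CERN authors."""
        if self.articles_published == 0:
            return 0.0
        return self.cern_articles / self.articles_published


def downloads_exceed_output(journals: List[JournalYear]) -> List[str]:
    """Titles for which downloads outnumber the articles published in the period."""
    return [j.title for j in journals if j.downloads > j.articles_published]


# Invented figures, for illustration only.
journals = [
    JournalYear("Journal X", downloads=5000, articles_published=1200, cern_articles=150),
    JournalYear("Journal Y", downloads=300, articles_published=800, cern_articles=10),
]

if __name__ == "__main__":
    for j in journals:
        print(j.title, round(j.downloads_per_article(), 2), j.cern_share())
    print(downloads_exceed_output(journals))
```

A ratio above 1 does not mean that every article of the period was read: title-level counts cannot distinguish repeated downloads of a few articles, or downloads of older volumes, from broad use.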

Figure 8: Comparison of full-text downloads, total published articles and articles published by CERN authors in core high energy physics titles

Figure 8 raises more questions. For certain journals, the total number of full-text downloads exceeds not only the number of articles by CERN authors, but also the total number of published articles. And the question behind the present study as to what is really used remains crucial and unanswered. We can formulate some hypotheses such as:

Unfortunately, we do not have the data that would confirm these hypotheses. To answer these questions we would need more article-oriented data. Only one publisher, Kluwer, offered this information and, after the merger with Springer (a COUNTER level 1 compliant publisher), we fear that it will no longer be provided:
“Will the user statistics be merged and put on one platform? If so, when? As part of the migration of all content to a single platform, we will also merge all of our usage reports to a common format. We also expect this to be completed by the end of 2004” [20].


- Statistics for e-journal usage are useful but the publishers’ reports are often insufficient for full analysis. Our approach would lead to different interpretations if applied to other libraries and contexts. Publishers’ reports do not satisfy the needs behind the present open access debate. New information products with costs based on the numbers of articles downloaded (IEEE Enterprise [21] and SCOPUS Elsevier [22]) reinforce an article-oriented approach more than a journal title approach. These data give only global information on titles and do not give value to articles.

- We would like publishers’ statistical reports to be improved at the detailed, article level and to provide useful information for management decisions, such as:

Measurement of the usage of e-journals and their articles is only possible from the publisher's web site. On the other hand, librarians have the knowledge about context, necessary to analyze data for the development of future services. Publisher revenues depend on budgets justified by librarians; librarians need to respond to readers’ needs as much as possible. To move in the right direction and to increase understanding about usage of published articles, a close collaboration is needed between publishers and librarians.


[1] Eliane Chaney, Catherine Bulliard, Caroline Christiansen and Jean-Pierre Cressent. “Une Bibliothèque de Recherche Face à l'Édition Électronique” Bulletin des bibliothèques de France, 44:2 (1999): 27-32.

[2] David Peter Dallman. “Electronic Journals and Electronic Publishing at CERN: a Case Study” in International Spring School on the Digital Library and E-publishing for Science and Technology, CERN, Geneva, Switzerland, 3 - 8 Mar 2002, p. 4.1-4.10.
Preprint available URL: <http://cdsweb.cern.ch/search.py?recid=597899&ln=en>

[3] Linda S. Mercer. "Measuring the Use and Value of Electronic Journals and Books", Science and Technology Librarianship (Winter 2000).
URL: <http://www.library.ucsb.edu/istl/00-winter/article1.html> (visited December 2, 2004).

[4] Simon Bevan and Louise Jones. “Using COUNTER Statistics: a Practical Perspective”, in: "Reports on briefing sessions and workshops held at the 27th UKSG Annual conference", Serials, 17:2 (July 2004): 169.

[5] International Coalition of Library Consortia (ICOLC). Guidelines for Statistical Measures of Usage of Web-Based Indexed, Abstracted, and Full Text Resources (November 1998).
URL: <http://www.library.yale.edu/consortia/webstats.html> (visited December 2, 2004)

[6] COUNTER: Counting Online Usage of Networked Electronic Resources. The COUNTER Code of Practice. Release 1: December 2002.
URL: <http://www.projectcounter.org/code_practice.html#start> (visited December 2, 2004)

[7] COUNTER: Counting Online Usage of Networked Electronic Resources. The COUNTER Code of Practice — Journals and Databases. Release 2, April 2005 (valid from 1 January 2006). (Draft published on the COUNTER website in April 2004).
URL: <http://www.projectcounter.org/cop2.html> (visited December 2, 2004)

[8] Association of Research Libraries. E-metrics: Measures for Electronic Resources.
URL: <http://www.arl.org/stats/newmeas/emetrics/> (visited December 2, 2004)

[9] National Information Standards Organization (NISO). NISO Z39.7-2004 Draft Standard for Trial Use. American National Standard for Information Services and Use: Metrics & Statistics for Libraries and Information Providers. Data Dictionary. Approved: 6th October 2004.

[10] International Organization for Standardization (ISO), 2003. ISO/CD 2789 Information and Documentation: International Library Statistics.

[11] International Organization for Standardization (ISO), 2003. ISO/CD 11620/AMD.1 Information and Documentation: Library Performance Indicators.

[12] International Organization for Standardization (ISO), 2003. Technical Report 20983: Information and Documentation - Performance Indicators for Electronic Library Services.

[13] EQUINOX. Library Performance Measurement and Quality Management System — 27th November 1998 - 26th November 2000 (Project funded under the Telematics for Libraries Programme of the European Commission).

[14] JISC/Publishers' Association (PALS) Working Group and Joint Working Parties. PALS Working Group. Home page

[15] National Commission on Libraries & Information Science (NCLIS). NCLIS Library Statistics Program (LSP) Home page

[16] CERN Document Server (CDS). Homepage

[17] Carmen O'Dell, David Peter Dallman, J. Vigen and Martin Vesely. “50 Years of Experience in Making Grey Literature Available: Matching the Expectations of the Particle Physics Community” Publishing Research Quarterly, 20:1 (2004), p. 84-91.
Also in: 5th International Conference on Grey Literature: Grey Matters in the World of Networked Information (GL 2003). Amsterdam, The Netherlands , 4 - 5 Dec 2003: p. 117-123.
Also available as preprint: CERN-OPEN-2003-053. URL:<http://cdsweb.cern.ch/search.py?recid=690102&ln=en>

[18] J. Vigen. “Why Go Via Expensive Solutions, When You Can GoDirect? : a Description of How CERN Library Integrates Digital Content”. In: The Digital Library and e-Publishing for Science, Technology and Medicine : practical course information ,CERN, Geneva, Switzerland , 13 - 18 Jun 2004.
Also available as preprint: CERN-OPEN-2004-018, 10p. URL:<http://cdsweb.cern.ch/search.py?recid=780090&ln=en>

[19] Springer Open Choice.

[20] Springer. In: Customer FAQ for the Springer-Kluwer merger, Effective August 1, 2004, Springer.

[21] IEEE Enterprise. Home page
URL: <http://www.ieee.org/portal/site/discover/menuitem....>

[22] Scopus. Homepage.

Further reading:

Anthony W. Ferguson, “Back talk — Use Statistics: Are They Worth It?” Against the grain, 14:6 (Dec 2002 — Jan 2003), p. 93-94.

Tony Kidd, “Electronic Journal Usage Statistics: Present Practice and Future Progress” In: Statistics in practice – measuring & managing (Papers from an international array of specialist contributors to the IFLA Satellite Conference, held at Loughborough University in August 2002), LISU Occasional Paper No. 32, May 2003, p. 67-72.

Judy Luther, White Paper on Electronic Journal Usage. (Council on Library and information resources, 2000)

Judy Luther. “Getting statistics we can use”. In: Meaningful measures for emerging realities. Proceedings of the 4th Northumbria International Conference on Performance Measurement in Libraries and Information Services, Pittsburgh, PA, August 12-16, 2001, ed. Joan Stein, Martha Kyrillidou and Denise Davis (Washington, DC: Association of Research Libraries, 2002), p.321-330.

Author Details

Magaly Báscones Dominguez
UNHCR Library and Visitors’ Centre
Rue de Montbrillant 94
1211 Geneva 2, Switzerland

Tel: +41 22 7363408
Email: bascones@unhcr.ch

Magaly Báscones Dominguez studied law and information sciences. This article is based on her activities whilst working at the Periodicals Unit of the CERN Library. At present, she works as an Associated Information Officer at the United Nations High Commissioner for Refugees (UNHCR) Library and Visitors’ Centre.

For citation purposes:

Magaly Báscones Dominguez, "Applying Usage Statistics to the CERN E-journals Collection: a Step Forward", High Energy Physics Libraries Webzine, issue 11, August 2005

Reader Response

If you have any comments on this article, please contact the Editorial Board.

Last modified:  26 July 2005