Consuming Article-Level Metrics: Observations and Lessons

The Journal Impact Factor (JIF; owned and published by Thomson Reuters) [1], [2] is a summation of the impact of all articles in a journal based on citations. Publishers have used the JIF to gain recognition, authors are evaluated by their peers based on the JIF of the journals in which they have published, [3] and authors often choose where to publish based on the JIF.

The JIF has significant flaws, including being subject to gaming [4] and not being reproducible. [5] In fact, the San Francisco Declaration on Research Assessment has a growing list of scientists and societies that would like to stop the use of the JIF in judging the work of scientists. [6] An important critique of the JIF is that it does not measure the impact of individual articles; clearly, not all articles in a journal are of the same caliber. Article-level metrics measure the impact of individual articles, including usage (e.g., pageviews, downloads), citations, and social metrics (or altmetrics, e.g., Twitter, Facebook). [7]

Article-level metrics have many advantages over the JIF, including:


Openness

Article-level metrics are largely based on data that is open to anyone (though some sources are not, e.g., Web of Science, Scopus). If data sources are open, conclusions based on article-level metrics can be verified by others, and tools can be built on top of the article-level metrics.


Speed

Article-level metrics are nearly real-time metrics of scholarly impact. [7] Citations can take years to accrue, but mentions and discussion that can be searched on the web take hours or days.

Diversity of sources

Article-level metrics include far more than just citations and provide metrics in a variety of domains, including discussion by the media (mentions in the news), discussion by the public (Facebook likes, tweets), and importance to colleagues (citations).

There are many potential uses for article-level metrics, including:


Research

As article-level metrics rise in use and popularity, research on article-level metrics themselves will inevitably become a more common use case. Some recent papers have addressed the following questions: How do different article-level metrics relate to one another? [8], [9] What is the role of Twitter in the lifecycle of a paper? [10] Can tweets predict citations? [11], [12] These questions involve collecting article-level metrics in bulk from article-level metrics providers and then manipulating, visualizing, and analyzing the data. This use case often requires using scripting languages (e.g., Python, Ruby, R) to consume article-level metrics. Consuming article-level metrics from this perspective is somewhat different from the use case in which a user views article-level metrics hosted elsewhere in the cloud. The research use case is the one with which this paper is concerned.
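As an illustration of the kind of scripted analysis this use case involves, the following minimal Python sketch asks how two metrics relate to one another by computing a Spearman rank correlation between tweet counts and citation counts. The article records, DOIs, and field names (`tweets`, `citations`) are invented for illustration and do not come from any provider; a real analysis would more likely use a statistics library than hand-written functions.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented example records; field names are assumptions, not a provider schema.
articles = [
    {"doi": "10.0000/example.1", "tweets": 2, "citations": 1},
    {"doi": "10.0000/example.2", "tweets": 15, "citations": 9},
    {"doi": "10.0000/example.3", "tweets": 0, "citations": 0},
    {"doi": "10.0000/example.4", "tweets": 40, "citations": 4},
    {"doi": "10.0000/example.5", "tweets": 7, "citations": 3},
]

rho = spearman(
    [a["tweets"] for a in articles],
    [a["citations"] for a in articles],
)
print(f"Spearman correlation between tweets and citations: {rho:.3f}")
```

A rank-based correlation is used here because metric counts tend to be heavily skewed, although that choice is itself an assumption about the data.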


Credit

Some scholars already put article-level metrics on their CVs, usually in the form of citations or JIFs. With the rise of article-level metrics, this will become much more common, especially with initiatives like that of the U.S. National Science Foundation (NSF), which now allows scholars to get credit for products, not just papers; products like videos or presentations cannot be measured by citations or JIFs. This use case will involve scholars with a wide variety of technical skills and will be made easy with tools from ImpactStory or other providers. [13]


Filtering

Scholars cannot possibly find relevant papers efficiently given that there are now tens of thousands of scholarly journals. Individual article-level metrics components can be used to filter articles. For example, many scientists use Twitter and are more likely to view a paper that is tweeted often, which is, in a way, leveraging article-level metrics. Article-level metrics can also be used to filter more directly. For example, article-level metrics are now presented alongside papers and can be used to make decisions about which papers to read. Readers may be drawn, for example, to a paper with a large number of tweets or blog mentions.
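Filtering of this kind is simple to script once the metrics are in hand. The sketch below keeps only articles that meet a minimum count for one metric and lists the most-mentioned first. The function name `filter_by_metric`, the records, and the field names are hypothetical and chosen only for illustration.

```python
def filter_by_metric(articles, metric, minimum):
    """Keep articles whose `metric` count is at least `minimum`, highest first.

    Articles that lack the metric entirely are treated as having a count of 0.
    """
    kept = [a for a in articles if a.get(metric, 0) >= minimum]
    return sorted(kept, key=lambda a: a.get(metric, 0), reverse=True)


# Invented example records; field names are assumptions, not a provider schema.
reading_list = [
    {"doi": "10.0000/example.1", "tweets": 3, "blog_mentions": 0},
    {"doi": "10.0000/example.2", "tweets": 58, "blog_mentions": 4},
    {"doi": "10.0000/example.3", "tweets": 21},
    {"doi": "10.0000/example.4", "blog_mentions": 2},
]

for article in filter_by_metric(reading_list, "tweets", minimum=10):
    print(article["doi"], article["tweets"])
```

The threshold is arbitrary; what counts as a large number of tweets will differ by field and by the age of the paper.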

In this paper I discuss article-level metrics from the perspective of developing and using scripting interfaces for article-level metrics. From this perspective, there are a number of considerations:

  1. Where can you get article-level metrics data?
  2. Data consistency
  3. Data provenance
  4. Article-level metrics in context
  5. Technical barriers to use

Article-level metrics data providers

There are a number of publishers now presenting article-level metrics for peer-reviewed articles on their websites (for examples, see Wiley-Blackwell, Nature, Public Library of Science (PLOS), Frontiers, and BioMed Central). Most of these publishers do not provide public-facing APIs (Application Programming Interfaces, a way for computers to talk to one another) for article-level metrics data, but instead use aggregators to provide article-level metrics data on their papers. One exception is PLOS, which collects its own article-level metrics and provides an open API for accessing this data.
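To make concrete what consuming such an API looks like from a script, the sketch below parses a JSON payload of the general kind a metrics API might return and flattens it into one row per article and source. The payload shape and the field names (`doi`, `sources`, `name`, `total`) are assumptions made for illustration, not the documented response format of PLOS or of any aggregator; the sample is embedded as a string so that no network request or real endpoint is implied.

```python
import json

# Hypothetical payload; the structure is an assumption, not a real API response.
SAMPLE_RESPONSE = """
[
  {"doi": "10.0000/example.1",
   "sources": [{"name": "twitter", "total": 12},
               {"name": "citations", "total": 3}]},
  {"doi": "10.0000/example.2",
   "sources": [{"name": "twitter", "total": 0}]}
]
"""


def flatten_metrics(payload):
    """Turn a JSON payload into (doi, source, count) rows for analysis."""
    rows = []
    for article in json.loads(payload):
        # An article with no "sources" key simply contributes no rows.
        for source in article.get("sources", []):
            rows.append((article["doi"], source["name"], source["total"]))
    return rows


for doi, source, count in flatten_metrics(SAMPLE_RESPONSE):
    print(doi, source, count)
```

A flat table of this kind is convenient because each provider's nesting can be handled once, at the point of parsing, and the rest of the analysis can work with uniform rows.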