[Image above] Just as there were growing pains in adopting more efficient lighting technology, so too will there be hurdles to embracing more effective evaluation metrics for science research. Credit: Mark Jurrens, Wikimedia (CC BY-SA 4.0)


Evaluating the quality of research and researchers is neither easy nor simple, particularly when evaluating fundamental scientific studies. Even with interesting and important results, the road from discovery to societal impact is long and riddled with barriers. Thus, judging research means assigning value to many intangible qualities, such as influence within and beyond academia.

For many years now, those who control the funding of research and the careers of researchers have relied mainly on quantitative metrics for their decisions. Among those metrics are the journal impact factor and the h-index.

Journal impact factor, which is published by analytics company Clarivate, measures the average number of citations that a journal's articles receive in the two years following publication. Specifically, the impact factor for a given year is the number of citations made that year to articles published in the previous two years, divided by the number of citable articles published in those two years. It is an uncomplicated formula designed to measure the short-term impact of each journal. It says nothing about longer-term influence.
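As a concrete illustration, the short sketch below works through that ratio with made-up numbers; the citation and article counts are invented for the example, not taken from any real journal.

```python
# Illustrative journal impact factor calculation with invented numbers.
# The 2023 impact factor counts citations made in 2023 to articles
# published in 2021 and 2022, divided by the citable items from those years.
citations_in_2023_to_2021_2022_articles = 1200  # hypothetical
citable_items_published_2021_2022 = 400         # hypothetical

jif_2023 = citations_in_2023_to_2021_2022_articles / citable_items_published_2021_2022
print(f"2023 journal impact factor: {jif_2023:.1f}")  # prints 3.0
```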

h-index, on the other hand, is a measure of longer-term value, and it can be applied to individual researchers as well as journals. The h-index is the largest number h for which at least h papers within the evaluation set have h or more citations each. For example, say a researcher publishes 50 papers. Of those 50 papers, 20 have three citations, 12 have seven citations, 10 have nine citations, four have 10 citations, and four have 12 citations. Eighteen papers have at least nine citations, but only eight have at least 10, so nine is the largest qualifying value and this researcher's h-index is nine.
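The calculation is easy to automate. The short Python sketch below reproduces the worked example above; the function name is ours, chosen for illustration.

```python
def h_index(citation_counts):
    """Return the largest h such that at least h papers have h or more citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# The example from the text: 50 papers with the stated citation counts.
papers = [3] * 20 + [7] * 12 + [9] * 10 + [10] * 4 + [12] * 4
print(h_index(papers))  # prints 9
```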

There are challenges with both these metrics at the journal level. For example, both increase as the number of papers published increases. Furthermore, the h-index grows with time, so older journals generally have higher h-index values.

Over the past decade or so, various stakeholders in the research enterprise have recognized the shortcomings of journal metrics in the context of evaluating an individual’s or institution’s quality of research.

For example, at its 2012 annual meeting in San Francisco, the American Society for Cell Biology issued the San Francisco Declaration on Research Assessment (DORA). A core value of this declaration is a call to base research assessment on the value and impact of all research outputs (including datasets and software) in addition to research publications. It also requests that a broad range of impact measures be considered, including qualitative indicators of research impact, such as influence on policy and practice.

However, the use of simple metrics for evaluating research is firmly entrenched, and changing entrenched practices is difficult even when a better alternative exists.

Take, for example, the humble lightbulb. For more than 100 years, the incandescent bulb reigned supreme. The bulbs were initially expensive and unreliable, and moving away from candles and gas lamps seemed impossible. Yet technology and production improved over the years to create an inexpensive and dependable product.

But in the late 20th century, higher energy prices, stress on electrical grids, and concerns over climate came together to create the need for higher-efficiency lighting. Some sectors of society embraced the change, but the convenience and familiarity of incandescent bulbs were high hurdles to overcome. A new mindset was needed to complement the new technology.

Early-stage replacements included compact fluorescent lights (CFLs) and LEDs, which initially were expensive and (particularly CFLs) had aesthetic drawbacks. Thanks to enabling ceramic technologies, LED bulbs now provide characteristics similar to incandescent bulbs at a cost that is only marginally higher. Because LEDs consume only about 15% as much electricity as incandescent bulbs, and they last 5–10 times longer, the lower total cost of ownership has led to widespread consumer acceptance.

We are at a similar juncture with researcher evaluation. In 2020, the Chinese government changed the country’s research assessment policies so they no longer rely solely on quantitative metrics (number and prestige of publications) but now also include assessments of portfolios of exemplary publications.

Last year, stakeholders in Europe finalized their Agreement on Reforming Research Assessment, which “requires basing assessment primarily on qualitative judgement, for which peer review is central, supported by responsible use of quantitative indicators.” This agreement sets out timelines to create an action plan by the end of 2023 and complete the first round of reviews by the end of 2027.

Because these assessments contain both quantitative and qualitative components, the need remains for transparent and reliable quantitative measures.

In a recent open-access article published in International Journal of Ceramic Engineering & Science, two ACerS journal editors examined various quantitative methods for evaluating the impact of journals in the ceramic sciences, including the h-index, journal impact factor, and the recently introduced MZE-index (defined by Montazerian–Zanotto–Eckert).

The authors are ACerS Fellow John C. Mauro, Dorothy Pate Enright Professor and associate head for graduate education at The Pennsylvania State University, and Maziar Montazerian, visiting professor at the Federal University of Campina Grande in Brazil. Mauro is editor-in-chief of Journal of the American Ceramic Society, while Montazerian is associate editor for JACerS and International Journal of Applied Ceramic Technology.

Briefly, the MZE-index uses a power-law equation to relate any quantitative metric (e.g., journal impact factor, h-index) to the number of articles in the evaluation set, which may be defined by region, topic, or another broader comparison area, and then evaluates the individual data points (e.g., researchers, institutions, journals) relative to that fitted average. Individual data points with positive MZE values (above the average) are said to have relatively higher visibility, and vice versa. Notably, the three ACerS hybrid journals have positive MZE values; the gold open-access IJCES has not yet received the underlying metrics needed to calculate an MZE value.
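The paper gives the exact formulation of the MZE-index; the sketch below only illustrates the general idea with a toy comparison set. The journal names, numbers, and the relative-deviation scoring used here are assumptions for illustration, not the authors' definition.

```python
import numpy as np

# Toy illustration of an MZE-style analysis (not the authors' exact formula):
# fit a power law, metric ≈ a * N**b, across a comparison set of journals,
# then score each journal by how far it sits above or below that trend.
journals = {                       # (number of articles, metric such as JIF)
    "Journal A": (1200, 4.1),      # all values invented for the example
    "Journal B": (300, 3.5),
    "Journal C": (2500, 5.0),
    "Journal D": (150, 1.2),
}

n = np.array([v[0] for v in journals.values()], dtype=float)
metric = np.array([v[1] for v in journals.values()], dtype=float)

# A linear fit in log-log space gives the power-law exponent b and prefactor a.
b, log_a = np.polyfit(np.log(n), np.log(metric), 1)
expected = np.exp(log_a) * n**b

# Positive scores sit above the fitted trend (higher visibility than expected
# for a journal of that size); negative scores sit below it.
for name, actual, pred in zip(journals, metric, expected):
    print(f"{name}: {(actual - pred) / pred:+.2f}")
```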

Mauro and Montazerian applied the MZE analysis to 31 journals in or related to the field of ceramics and then compared the results to existing indicators, particularly the SCImago Journal Rank (SJR), which is based on Elsevier's Scopus database and is akin to the journal impact factor, though weighted by the prestige of the citing journals. They found that

  1. Unlike the journal impact factor and h-index, the MZE value is decoupled from the number of articles (i.e., its value does not rise as the number of articles increases). This decoupling helps to normalize the metric, i.e., allows it to reflect quality rather than quantity.
  2. Also because of this decoupling, MZE better captures the short- and long-term performance of journal articles.
Comparison of MZE values to SJR quartiles for ceramics-related journals. While there is general agreement between the two indices, Mauro and Montazerian suggest investigating quality factors for the outlying journals. Graph created by ACerS staff using data from the article. Credit: ACerS

Interestingly, the MZE results correlated reasonably well with the SJR quartile rankings, with some notable anomalies. The authors point out, for example, that the MZE value of International Journal of Applied Glass Science, which has a Q3 SJR ranking, is higher than those of many Q1 and Q2 journals, indicating an editorial quality higher than other metrics suggest. They show similar results for other small, specialized journals. The authors suggest that further investigation into these journals is warranted.

Based on these results, Mauro and Montazerian conclude by emphasizing the importance of bringing new evaluation metrics into mainstream use.

“[Though] easy-to-understand metrics such as JIF [journal impact factor] and the h-index are preferred … They should be used in conjunction with other bibliometric indices, such as the MZE-index and SJR,” they write.

The open-access paper, published in International Journal of Ceramic Engineering & Science, is “Evaluating the impact of journals in the ceramic sciences” (DOI: 10.1002/ces2.10180).

Author

Jonathon Foreman

CTT Categories

  • Basic Science
  • Education