SELF-ENHANCEMENT IN SCIENTIFIC RESEARCH: THE SELF-CITATION BIAS

A typical psychology article contains 3 to 9 self-citations, depending on the length of the reference list (10% of all citations). In contrast, cited colleagues rarely receive more than 3 citations. This is what we call the self-citation bias: the preference researchers have to refer to their own work when they guide readers to the relevant literature. We argue that this finding is difficult to understand within the traditional, science-based view, which says that reference lists are there to help the reader. It is more easily understood within a social view of reference lists which argues that scientists form groups and that reference lists partly reflect well-known phenomena in social psychology and group dynamics. Within this view, the self-citation bias is a self-serving bias motivated by self-enhancement and self-promotion.


The self-citation bias in psychological science
Scientific publications are a never-ending source of inspiration, not only because of the information they contain but also because of the formal characteristics they adhere to. Reference lists in particular have been scrutinised recently, with some quite remarkable findings. Below, we first summarise the available evidence and then look more specifically at the number of self-citations in journal articles and the reasons why authors cite themselves.

The traditional, science-based view of reference lists
Readers of scientific articles expect an article's reference list to comprise information about the publications they need for a good understanding of the article's contribution to the field (i.e., the cumulative nature of science) and for a replication of the reported studies if they wish to do so (i.e., the replicability of the findings). From this perspective, reference lists are at the readers' service, to help them find critical information. We will call this the traditional, science-based view of reference lists.
An interesting study within the traditional view was published by Adair and Vohra (2003). Among other findings, they reported that the number of references in psychology journals increased considerably between the early 1970s (M = 13) and 2000 (M = 54). Another finding was that in the same time period the percentage of references to 'old' publications (published more than 20 years before) rose from 5% to 19%, reflecting the fact that psychology became a more mature science in the second half of the 20th century.
The traditional view of references also lies at the heart of esteem measures based on numbers of citations. The best known of these is the journal impact factor, calculated on the basis of the number of citations made to articles published in the previous 2 years. Other measures are the total number of citations per author, research group, or institute, and, more recently, the h-index (Hirsch, 2005). Many of these measures can readily be obtained from sources such as the ISI Web of Science or Scopus. The idea behind them is that the more a publication is referred to, the more important it is. Conversely, an author who publishes a lot but is never cited cannot be expected to make a large difference in the field.
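To illustrate how such a citation-based measure is derived, the h-index is the largest number h such that an author has h publications with at least h citations each. The following is a minimal Python sketch of that definition; the function name and the citation counts are hypothetical and serve only as an illustration.

```python
def h_index(citation_counts):
    """Return the largest h such that h publications have at least h citations each."""
    # Rank publications from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank  # the first `rank` publications all have >= rank citations
        else:
            break
    return h


# Hypothetical author with five publications.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))
```

Because every citation counts equally in this calculation, citations that authors give to their own work enter the measure in the same way as citations from colleagues.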

The social view of reference lists
There are several indications that the traditional, science-based view of reference lists does not provide a full explanation. References are not always included because they are essential to understand the argument or because they are the best source of information. Sometimes they are included (or excluded) for reasons that are easier to understand from the perspectives of social psychology and group dynamics than from a pure scientific point of view.
For instance, Lange and Frensch (1999) noticed that researchers who became editor of an (American) journal saw their number of citations in that journal increase more than could be expected on the basis of their performance alone (as measured by the number of citations they received from journals of which they were not the editor). Although the reasons for this increase may be multiple (e.g., authors may be more likely to submit their manuscript to a journal with an editor they know and respect), it is not unreasonable to assume that part of the increase is due to the editor's reward power and to tactics used by the authors to increase their chances of getting published, phenomena that are well known within the social psychology of group dynamics (e.g., Snyder & Stukas, 1999). In addition, everyone with publication experience will have noticed that some reviewers and editors are quite helpful in providing extra references. These suggestions are not always without self-interest. Editors, for instance, have an interest in suggesting (recent) articles from their own journal, as this may increase the impact of the journal.
Pasterkamp, Rotmans, de Kleijn, and Borst (2007) discovered another regularity in references. They examined the major cardiovascular journals and observed that authors were more often cited by authors from their own country than by authors from other countries (self-citations excluded). For instance, USA authors gave an average of 4.1 citations to USA articles vs. 1.6 to articles from other countries. In contrast, the other countries each added on average 0.6 citations to USA articles vs. 0.9 citations to articles from their own country. Although again there may be different reasons for this observation (e.g., researchers may have more interactions with colleagues from their own country), the phenomenon strikes a chord with the well-known social phenomenon of in-group favouritism, the tendency to rate members of one's own group higher than members of other groups and to favour them when distributing positive outcomes.
Finally, a striking aspect of reference lists (and one that is of particular interest to the current study) is that there is no shortage of self-citations. Falagas and Kavvadia (2006) noticed that articles in biomedical journals contain on average 6 self-citations out of a total of 39 references (15%). A similar observation was made by Hyland (2001), who in addition noticed that the percentage of self-citations is higher in the hard sciences (biology, engineering, physics: 12%) than in the soft sciences (marketing, sociology, applied linguistics, philosophy: 4%).
Within the traditional, science-based view of reference lists, the high number of self-citations indicates that these references are critical for a good understanding of the text. Falagas and Kavvadia (2006) and Hyland (2001), however, interpreted them more in terms of self-praise. They hypothesised that authors include a large number of self-citations to promote and praise themselves. Indeed, self-enhancement and self-serving biases are well-documented phenomena in social psychology. They are biases that protect and enhance self-esteem (e.g., Mezulis, Abramson, Hyde, & Hankin, 2004). Most students in the Western world, for instance, are convinced they are (slightly) better than the average student, just as most people are convinced that they are better car drivers than average. In addition, people are particularly attracted to things that refer to themselves, as shown in a study by Brendl, Chattopadhyay, Pelham, and Carvallo (2005). They gave participants two brands of tea to choose from, one with a name that referred to their own name and one with a name that referred to another person. The participants predominantly preferred the tea referring to themselves. So, Larry thought that the Larin tea was better, whereas Sandra had a preference for the Sanya tea (although both were the same tea).
According to the social view of reference lists, the high number of self-citations says more about the social functioning of researchers within their group of scientists than about their courtesy to serve the reader. Researchers include self-citations partly because they think they are better than the average researcher and because they like articles that refer to themselves. Another socially motivated reason is that researchers actively want to promote their findings. Evidence for this is found in the fact that self-citations are particularly frequent in the first years after the publication of an article (Fowler & Aksnes, 2007; Glänzel, Thijs, & Schlemmer, 2004). Finally, self-citations are also helpful for the impact of an author (e.g., they can enhance the h-index of the person).
Below, we discuss our attempt to gather more evidence for the social view. A problem with the existing evidence is that it simply points to the high number of self-citations. It does not allow us to dissociate the number of self-references that are critical for the understanding of the article from the number of self-references that are motivated by self-enhancement and self-promotion. To distinguish between these two, one must show that the number of self-citations is substantially higher than the number of references to other important researchers in the field. This can be done by yoking target articles to related articles written by different authors and by comparing the number of self-citations with the number of cross-references.

Method
Four journals were scrutinised: Psychological Science, the Journal of Experimental Psychology: Learning, Memory, and Cognition, the Quarterly Journal of Experimental Psychology, and the European Journal of Cognitive Psychology. Of the first two journals, one issue was analysed (respectively the last one of 2006 and the first one of 2007); of the other two, we took two issues (the last of 2006 and the first of 2007) as the number of articles per issue was rather small. Psychological Science differs from the other journals, because the maximum number of references is capped at 50 for a general article and at 30 for a short research report.
For each article we searched in the ISI Web of Science for the article with the largest overlap in cited references that was authored by a different group of authors (by making use of the 'find related records' button). This criterion has the main advantage that it is objective and easily replicable. It does have some drawbacks, though. For a start, it tends to favour review articles with a large reference list (as the likelihood of overlapping references increases with the number of references in the reference lists). Second, it is not a fool-proof guarantee that the articles are dealing with the same topic (although we did not observe any conspicuous oddities in this respect). Third, it is not impossible that some related records came from authors who in the past collaborated with each other or who were working in the same institute. However, we judged that these limitations were outweighed by the risk that handpicking the best-matching partners would introduce unwanted biases.
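The selection criterion can be sketched in a few lines of Python. This is only an illustration of the rule described above (largest overlap in references, no shared authors), not of how the ISI Web of Science implements its 'find related records' function; the function name, the dictionary layout, and the example records are all hypothetical.

```python
def pick_related_article(target, candidates):
    """Return (candidate, overlap) for the candidate that shares the most
    references with the target while having no author in common with it.
    Returns (None, 0) if no candidate qualifies."""
    target_refs = set(target["references"])
    target_authors = set(target["authors"])
    best, best_overlap = None, 0
    for candidate in candidates:
        # Skip articles written by (some of) the same authors.
        if target_authors & set(candidate["authors"]):
            continue
        overlap = len(target_refs & set(candidate["references"]))
        if overlap > best_overlap:
            best, best_overlap = candidate, overlap
    return best, best_overlap


# Hypothetical records; "r1", "r2", ... stand for entries in a reference list.
target = {"authors": ["A", "B"], "references": ["r1", "r2", "r3", "r4", "r5"]}
candidates = [
    {"id": "same-group", "authors": ["B", "C"],
     "references": ["r1", "r2", "r3", "r4", "r5"]},
    {"id": "empirical", "authors": ["X"],
     "references": ["r1", "r2", "r3", "r9"]},
    {"id": "review", "authors": ["Y", "Z"],
     "references": ["r1", "r2", "r3", "r4"] + ["r%d" % i for i in range(10, 60)]},
]
best, overlap = pick_related_article(target, candidates)
print(best["id"], overlap)
```

In this made-up example the article with the long reference list is selected, which mirrors the first drawback noted above: a larger reference list makes overlapping references more likely.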
A self-citation was defined as an entry in the reference list of which at least one of the article's authors was (co-)author. A cross-reference was defined as an entry in the reference list of which at least one of the authors of the related article was (co-)author. So, the numbers of self-citations and cross-references refer to (groups of) persons, not to specific articles.
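These two definitions can be expressed as a short Python sketch. The function name, the representation of a reference list as a list of author lists, and the example authors are hypothetical; matching authors by exact name is a simplifying assumption.

```python
def count_citation_types(reference_authors, article_authors, related_authors):
    """Count self-citations and cross-references in one reference list.

    reference_authors: one list of author names per reference-list entry.
    article_authors:   authors of the article that contains the reference list.
    related_authors:   authors of the yoked (related) article.
    """
    own = set(article_authors)
    others = set(related_authors)
    # An entry is a self-citation if at least one of the article's authors co-authored it.
    self_citations = sum(1 for authors in reference_authors if own & set(authors))
    # An entry is a cross-reference if at least one author of the related article co-authored it.
    cross_references = sum(1 for authors in reference_authors if others & set(authors))
    return self_citations, cross_references


# Hypothetical reference list with five entries.
references = [
    ["Author A", "Author C"],
    ["Author X"],
    ["Author D"],
    ["Author B"],
    ["Author X", "Author A"],
]
self_count, cross_count = count_citation_types(
    references, ["Author A", "Author B"], ["Author X"]
)
print(self_count, cross_count)
```

Note that under these definitions an entry co-authored by members of both groups (the last entry in the example) counts both as a self-citation and as a cross-reference.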

Results
Table 1 lists the main findings. Although there are some small variations across sources, a pretty consistent picture emerges: psychological researchers include some 10% self-citations in their reference lists. This is in line with the figures reported for the hard sciences (Hyland, 2001) and slightly below those of the biomedical sciences (Falagas & Kavvadia, 2006). The number of self-citations is well above the number of citations to 'relevant others', except in the European Journal of Cognitive Psychology. In the target articles, which were mainly empirical articles, there was a difference of 2 references; in the related articles, which were mainly review articles, the difference went up to 7.
A finding that is hidden in Table 1 is the wide variability in the percentages of self-citations between the articles. Table 2 gives these figures. They show that for all journals the percentages of self-citations range from 0% to roughly one third. There even was some evidence for a negative correlation between the percentage of self-citations and the percentage of cross-references (Spearman's rho = -.21, n = 136, p = .016).
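For readers unfamiliar with the statistic, Spearman's rho is the Pearson correlation computed on the ranks of the two variables, with tied values receiving their average rank. The following self-contained Python sketch shows the calculation; the function names and the percentages are hypothetical and are not the data of the present study.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upwards; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y (undefined if either is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical percentages of self-citations and cross-references for six articles.
self_pct = [0.0, 5.0, 10.0, 12.0, 20.0, 33.0]
cross_pct = [9.0, 4.0, 6.0, 3.0, 2.0, 0.0]
print(round(spearman_rho(self_pct, cross_pct), 2))
```

A negative value, as in the reported result, means that articles with a higher percentage of self-citations tend to have a lower percentage of cross-references.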
To find out whether the differences in percentages of self-citations are purely defined in terms of situational variables (the article's topic, the journal, the specific combination of authors, …) or whether they also reflect some stable characteristic of (groups of) authors, we searched for articles by (one of the) authors that were at least 10 years old. By using such a large time span, we excluded short-term variations and we limited our analysis to 'established' researchers. In addition, we restricted the analysis to articles with at least 10 citations in the reference list, in order to exclude short comments. We were able to locate matched article pairs for 56 of the original 68 articles. The mean percentage of self-citations in the period 1994-1997 was 12% (SD = 11.2), which did not differ significantly from the percentage in 2006-2007 (15%, SD = 11.2, Wilcoxon Signed Ranks test Z = -1.38, p = .168). The Spearman rank correlation between the percentages of self-citations was .30, which was significant at the .05 level (n = 56, p = .025).

Discussion
A typical psychology article contains 3 to 9 self-citations, depending on the length of the reference list (10% of all citations). In contrast, cited colleagues in general receive 1 to 3 citations. This is what we call the self-citation bias: the preference researchers have to refer to their own work when they guide readers to the relevant literature. There are large differences between articles in the number of self-citations, ranging from 0% to more than one third in the journals we looked at. These differences are to a large extent article-dependent, although there was a significant correlation over a period of 10 years between articles that shared at least one author. This may indicate that some (groups of) authors are more likely to include a higher or lower number of self-citations. Although this stability could point to personality factors, it might also be due to the issue under investigation (e.g., some topics may only be examined by a small number of authors).
We argue that the difference between the number of self-citations and the number of citations to colleagues is easier to understand within a social view of reference lists than within the traditional, science-based view. Researchers include a number of self-citations in their articles not because the self-citations are necessary to understand the argument or because they are the best state-of-the-art sources, but because they are good for the researchers' esteem, by means of self-enhancement and self-promotion. As indicated in the introduction, the social perspective also provides a ready account of why authors are more likely to cite colleagues from their own country (the in-group bias) and why they are more likely to include a reference to the editor of the journal (social tactics related to reward power).
Aksnes (2003) made an in-depth analysis of self-citations in close to 50,000 articles co-authored by Norwegian researchers and covered by the ISI Science Index. The first finding was that after 5 years 10% of the articles had not received any citation at all (not even a self-citation). Of the remaining articles, 71% had one or more self-citations. The number of self-citations increased the more citations an article had but at a slower pace, so that overall the least cited articles had a higher share of self-citations (29%) than the most cited articles (18%). Articles with many co-authors had more self-citations than articles with a single author, and the increase was nearly linear, going from 1.5 self-citations for a single-authored paper to 10 self-citations for an article with 15 authors. However, because at the same time the total number of citations increased, the overall percentage of self-citations was roughly independent of the number of authors. Self-citations have a particularly high impact in the first three years after publication, arguably because the authors are then promoting their new paper whereas few other researchers have come across it yet. This is one of the reasons why up to half of a journal's impact factor can be due to self-citations, in particular for journals with a low impact score (Anseel, Duyck, De Baene, & Brysbaert, 2004).
For the correct interpretation of Aksnes's (2003) findings, it is important to keep in mind that these analyses were based on a bibliometric database (i.e., the ISI Web of Science) and not on the reference lists themselves. The share of self-citations seems to be higher in bibliometric databases (15-30%, depending on the time period taken into account) than in the raw reference lists (10-15%). The reason is that in the former case the proportion of self-citations is calculated as the number of times a particular article is cited by its authors over a relatively short time period after publication relative to the total number of citations that article receives in that period (and database), whereas in the latter case the proportion of self-citations is calculated as the number of times the authors refer to publications of themselves relative to the total number of articles they cite in the article. This difference in definition explains quite a lot of the discrepancy in the estimates of self-citations reported in various studies.
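The two definitions can be made concrete with a small worked example in Python. All numbers are hypothetical and chosen only to show that the same article can yield quite different percentages under the two definitions; the function name is likewise an assumption.

```python
def percentage(part, total):
    """Return part as a percentage of total (total must be positive)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Reference-list definition: self-citations GIVEN in the article's own reference list.
# Hypothetical article citing 40 publications, 4 of them by its own authors.
reference_list_share = percentage(4, 40)

# Database definition: self-citations RECEIVED by the article in a short period.
# Hypothetical: 10 citations received in the first years, 3 of them from its own authors.
database_share = percentage(3, 10)

print(reference_list_share, database_share)
```

The denominators differ in kind: the first is the length of a reference list, the second the number of citations an article has attracted so far, which is typically small shortly after publication.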
Another question is whether self-citations pay off. Is a high number of self-citations idle boasting or does it help to advance the authors' case? The first article to look at this aspect (Medoff, 2006) reported no promotion due to self-citations. Articles that were cited by their authors got the same number of cross-references as articles that were not self-cited. This was based on 400 articles in economics journals. A more recent analysis by Fowler and Aksnes (2007), based on over 64,000 Norwegian publications, however, came to a different conclusion. Authors who rarely cite themselves receive fewer cross-citations in the long run than authors who regularly self-cite. On the basis of a mathematical model, Fowler and Aksnes estimated that each self-citation yields an extra 3.6 cross-citations in a 10-year period. So, although self-citations may not increase the likelihood that a particular article is cited (Medoff, 2006), they do increase the chances that a particular author is cited. There is also some evidence that including self-citations in a submitted manuscript increases the chances of getting the manuscript accepted for publication by the reviewers and the editor (Campanario, 1998). Based on these findings, researchers do indeed seem to have an incentive to promote their own work.
It is not clear whether editors should take action about self-citations. On the one hand, given the large individual differences and the fact that self-citations from a certain point on have more to do with self-promotion than with the advancement of science, editors may want to cap the proportion of self-citations at, say, 20%. There is some suggestion that this may be particularly relevant for journals limiting the total number of references. Indeed, a look at Tables 1 and 2 suggests that the percentage of self-citations is highest in Psychological Science, the journal that caps the number of references. Arguably, when faced with citation limitations, researchers are more likely to cut the number of cross-references than the number of self-citations. On the other hand, limiting the number of self-citations may have some knock-on effects on the impact factors of psychology journals and on the h-indices of authors and institutes, disadvantaging psychologists relative to their colleagues from less scrupulous disciplines. Research by van Raan and colleagues (Costas, Bordons, van Leeuwen, & van Raan, 2009; van Raan, 2008) suggests that in particular research topics with few investigators would suffer from a cap on self-citations.

Table 1
Number of references, self-citations and cross-references in four psychological journals