By the time students reach high school, most have learned that they cannot trust everything they read on the internet. But even by the time that students graduate from college, few realize that they cannot trust everything they read in scholarly journals. In this article we discuss how one can distinguish between papers that are relatively trustworthy and those that are best approached with grave skepticism.
Before we get into the details, however, we want to make one thing perfectly clear: Any scientific paper can be wrong. No matter where it is published, no matter who wrote it, no matter how well supported are its arguments, any scientific paper can be wrong. Every hypothesis, every set of data, every claim and every conclusion is subject to re-examination in light of future evidence. Conceptually, this stems from the very nature of science. Empirically, this can be readily observed in the pages of Nature and Science. We've seen it time and again. The most brilliant researchers and the most elite journals have published claims that turned out to be utterly wrong.
But what about peer review? Doesn't that solve the problem?
No. Peer review, while an important part of the scientific process, doesn't guarantee that published papers are correct. Peer reviewers carefully read a paper to make sure its methods are reasonable and its reasoning logical. They make sure a paper accurately represents what it adds to the literature, and that its conclusions follow from its results. They suggest ways to improve a paper, and sometimes recommend additional experiments. But peer reviewers can make mistakes, and more importantly, peer reviewers cannot possibly check every aspect of the work. Peer reviewers don't redo the study, rewrite the code, or even dig too deeply into the data in most cases. Though helpful, peer review cannot catch every innocent mistake, let alone uncover well-concealed acts of scientific misconduct.
As a result, there is no surefire way for you, as the reader, to know beyond the shadow of a doubt that any particular scientific paper is correct. Usually the best you can hope to do is to determine that a paper is legitimate. By legitimate, we mean that a paper is reasonably inferred to be (1) written in good faith, (2) carried out using appropriate methodologies, and (3) taken seriously by the relevant scientific community. If a paper is legitimate scholarship it still may turn out to be incorrect, but at least you've done due diligence. For the remainder of this article we consider how you can determine whether a paper is legitimate.
Traditionally, scholars have looked to scholarly journals to confer legitimacy upon their work. In brief, the process works like this. A researcher submits her work to a journal. The work is sent to other scholars in the field for peer review; the reviewers decide whether the paper merits publication, requires revision, or is not of suitable quality for the journal in question. While these reviewers cannot vouch for the correctness of every element of a paper, they are able to judge the reasonableness of the procedures, results, and interpretations.
Researchers view different journals as occupying different positions in a rough hierarchy; this hierarchy is well-known (and probably over-emphasized) among researchers. In general and all else equal, papers published in top journals will represent the largest advances and have the highest credibility, whereas papers published in medium-tier journals will report more modest advances (though not necessarily with lower credibility). Papers in lower-tier journals report the least interesting or least credible findings.
A quick way to evaluate the legitimacy of a published paper is to find out about the journal in which it is published. A number of websites purport to rank journal quality or prestige, typically ascertained based on citations. Highly cited journals are thought to be better than their seldom-cited competitors. Among these ranking systems, the Journal Impact Factor is the gold standard. Journal impact factor measures the ratio of citations received over a two-year window to the number of "citeable" articles published in that same window. While what constitutes an impressive impact factor varies from one field to another, it is a reasonable rule of thumb to consider that any journal listed in the Journal Citation Reports is at least somewhat reputable, any journal with an impact factor of at least 1 is decent, and any journal with an impact factor of at least 10 is outstanding. Unfortunately, impact factor scores are only available in bulk via a subscription service known as the Clarivate Journal Citation Reports (JCR).
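To make the ratio and the rule of thumb concrete, here is a minimal sketch in Python. It follows the description above rather than any official calculation; the function names and the example counts are hypothetical, and real impact factors should be taken from the JCR itself.

```python
def impact_factor(citations: int, citable_articles: int) -> float:
    """Citations received in the window divided by citable articles published in it."""
    if citable_articles <= 0:
        raise ValueError("need at least one citable article")
    return citations / citable_articles


def rule_of_thumb(impact: float) -> str:
    """Map an impact factor to the rough labels used in the text.

    The thresholds (1 and 10) are the article's rule of thumb; what counts
    as impressive varies from one field to another.
    """
    if impact >= 10:
        return "outstanding"
    if impact >= 1:
        return "decent"
    return "at least somewhat reputable, if listed in the JCR"


# Hypothetical journal: 450 citations to 300 citable articles.
example = impact_factor(450, 300)
print(example, rule_of_thumb(example))
```

With these made-up counts the ratio is 1.5, which lands in the "decent" band.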
Several free alternatives to Journal Impact Factor are also available. We the authors provide a set of journal indicators known as the Eigenfactor metrics. These metrics cover the same set of journals included in the JCR. We also provide a Chrome plugin for PubMed that color-codes search results according to our metrics. The large commercial publisher Elsevier provides an alternative set of metrics based on their Scopus database. Scopus covers a larger set of journals than the JCR, but we and others have expressed concern about possible conflicts of interest that arise when a major journal publisher begins ranking its own journals against those of its competitors. Google Scholar provides journal rankings in the form of journal h-index scores.
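For readers unfamiliar with the h-index: a set of articles has h-index h if h is the largest number such that h of the articles have each been cited at least h times. The sketch below computes it in Python from a list of per-article citation counts. The function name and the counts are hypothetical, and this is not a description of how Google Scholar selects which articles or citations to count.

```python
def h_index(citation_counts: list[int]) -> int:
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the top `rank` articles all have >= rank citations
        else:
            break      # counts only decrease from here, so h cannot grow
    return h


# Hypothetical journal with five articles.
print(h_index([10, 8, 5, 4, 3]))  # prints 4
```

Here four articles have at least four citations each, but there are not five articles with at least five, so the h-index is 4.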
Another reasonable check on the legitimacy of a paper is to examine the appropriateness of a paper's claims to its venue. In particular, one should be wary of extraordinary claims appearing in lower-tier venues. You can think of this as the scientist's version of "if you're so smart, why aren't you rich?" In other words, if your paper is such a big deal, why is it in a low-tier journal?
Thus if a paper called "Some weights of riparian frogs" lists the weights of some frogs in the little-known Tasmanian Journal of Austral Herpetology, there is relatively little cause for concern. A table of frogs' weights, while possibly useful to some specialists in the area, is far from a scientific breakthrough and is well-suited for the journal in question. If, however, a paper called "Evidence that Neanderthals went extinct during the Hundred Years' War" appears in the equally low-profile Journal of Westphalian Historical Geography, this is serious cause for concern. That finding, if true, would revolutionize our understanding of hominin history, not to mention shake our notion of what it means to be human. Such a finding could easily appear in a very high-profile journal. If it does not, that is a strong indicator that something is not right about the story.
While the examples above are hypothetical, real examples abound as well. For example, in 2012 television personality Dr. Mehmet Oz used his show to promote a research paper purporting to demonstrate that green coffee extract has near-miraculous properties as a weight-loss supplement. Despite this remarkable claim that could have enormous influence on hundreds of millions of lives, the paper did not appear in a top medical journal such as JAMA, The Lancet, or NEJM. Rather, it appeared in a little-known journal entitled Diabetes, Metabolic Syndrome and Obesity: Targets and Therapy from a marginal scientific publisher called Dove Press. The journal is not even listed in the Journal Citation Reports. This should set off alarm bells for any reader, and indeed a close look at the paper shows that the results are based on a clinical trial with an absurdly small sample size of 16, too small a sample to justify the strong claims that the paper makes.
The small sample size and inglorious venue of the green coffee extract paper described above turned out to be only part of the story. The paper was subsequently retracted because its data could not be validated.
While retractions are uncommon, it can be a good idea to check for a retraction or correction before staking too much on the results of a scientific paper. The easiest way to do this is simply to check the paper on the publisher's website or better yet, if it is a biomedical paper, on PubMed. If a retraction has occurred it will be clearly marked as such in these places. Below, the PubMed retraction notice for the green coffee extract paper. Unmissable.
Compare PubMed's notice above with the journal's own retraction notice below. The "highly accessed" badge at the top is more prominent than the fact that the article has been retracted!
The retraction process can take a long time and journals have few incentives to move quickly. Thus it can be valuable to also keep abreast of any post-publication peer review — that is, direct commentary presented outside of traditional journal publications. While finding each and every comment that anyone has made can be tricky, the website pubpeer.com aims to serve as a central repository for comments of this sort. Pubpeer allows anonymous comments, which can be useful in allowing whistleblowers to point out image duplications and other evidence of error or misconduct. While not everyone is a fan, we find this site quite useful. We have installed their Firefox and Chrome browser plugins to alert us with an orange bar (illustrated below) any time a paper or citation has been discussed on Pubpeer.
You may have heard the expression "publish or perish". This refers to the fact that many academics must publish their research on a regular basis to get hired and promoted, to obtain grant funding, and to maintain the esteem of their peers. But publishing isn't so easy; one needs to be able to produce work that is novel and of sufficient quality to clear the hurdles of legitimacy that we have discussed above.
Until recently, anyway. In the past decade, a vast number of essentially fake journals have arisen that cater to academics' needs to publish their work. The publishers of these journals typically charge authors a fee to publish their work. Then, rather than selling subscriptions or producing print copies, these publishers make that work available for free online.
Let's be clear: there's nothing inherently crooked about charging authors to publish. For decades many leading journals relied on both subscription revenue and revenue from page charges levied on their authors. More recently, many journals have eschewed a subscription-based business model in favor of an open access business model in which anyone can read the articles for free online, but (typically) the authors cover the costs by paying article processing charges that range from about a hundred dollars to over five thousand dollars. There are outstanding journals that use this model, including eLife and the PLOS family of journals. (Disclosure: I am currently an editor at the former and previously served as an editor at the latter.)
The problem with this new crop of fake journals is not that they charge the authors, it's that they don't uphold adequate standards of peer review. Many have fake editorial boards and, while promising to conduct proper peer review, will actually publish anything so long as an author is willing to pay the publication fees. This has resulted in some rather silly outcomes. But it also creates at least two serious types of problems. First, unsuspecting authors can be duped by these journals, submitting their work to low-quality venues that often disappear from the internet after a few years. Second, authors can co-conspire with these journals to give their pseudoscientific claims a (weak) veneer of scientific legitimacy.
For example, consider this gem from Astrophysics and Aerospace Technology published by the allegedly predatory OMICS Group:
Abstract: Modern scientists have been searching for clues of some type of extra-terrestrial life. However, ancient scriptures give very interesting and important information and details concerning the total contents of the whole universe. But it is given in a mathematical language, which has long been forgotten by the mankind. As such people have been interpreting those contents, as per their own limited knowledge of Geography and Cosmology. This has naturally, brought about several contradictions and created serious mismatch with the latest scientific findings. The author claims to have deciphered the Code, in which the Riŝis (Saints) had explained the contents of the universe (Lokākāŝa) in terms of living and non-living substances, along with its dependence on time-cycle...
Enough! It goes on and on, but I'm not going to type it out here. If you really have to see the rest, you can do so on your own time.
So how do you tell if a journal is legitimate or predatory? Personally, I feel most comfortable when a journal meets at least one of the following criteria:
But you may not know enough to be sure about any of these criteria, or you may be trying to evaluate a journal that meets none of them. What then? Until recently, scholars often turned to an admittedly controversial "black list" of predatory journals curated by librarian Jeffrey Beall. Unfortunately, Beall recently removed his list, along with the rest of his web presence, from the internet. He has not commented publicly so it is unclear exactly why, though it is widely believed to be a response to threats (hopefully legal rather than extralegal) from black-listed publishers. You can still view Beall's lists of publishers and of stand-alone journals at archive.org, but without updates their value will decline over time. An alternative (and less whack-a-mole) approach is to provide a "white list" of non-predatory publishers. Of these, the Directory of Open Access Journals is the best established. We the instructors are involved in developing FlourishOA, a system for identifying high-quality, high-value open access journals.
In what remains, we consider how to evaluate papers that have not yet been published in a scholarly journal. Some of these tips, such as considering the identity of the authors, are relevant for published work as well.
In fields such as physics and economics, scholars commonly post their work to preprint servers prior to submitting it for publication in a journal. In other areas, including biology and sociology, this practice is not yet common but is quickly gaining traction. This system accelerates the scientific research process by making results available more quickly, and improves the quality of the journal literature because authors can receive feedback from many researchers prior to final publication.
Some preprint servers, including the arXiv (physics, mathematics, and computer science) and bioRxiv (biology), have various procedures in place to screen every submission by some combination of author endorsement, natural language processing methods (some of which have been derived from our own work and perform surprisingly well), and human screening. Others do not. Screening does even less than peer review to ensure that a paper is correct, but often manages to filter out the seriously crackpot work.
The important message here is that one should not discount a paper or assume that it is unpublishable simply because it appears in a preprint archive instead of a formal journal. In many fields, the norm is to post initially to a preprint archive. Science writers have learned to scan these archives to get the first look at exciting new work, and thus it is not uncommon to see news stories about papers still at the preprint stage. That said, readers should approach preprints with at least as much critical-mindedness as they bring to papers already published in scholarly journals.
Nowadays, researchers probably use Google Scholar more often than commercial databases such as the Clarivate Web of Science or Elsevier's Scopus to find relevant articles. The underlying philosophies of these databases are different; Google Scholar attempts to index most everything, whereas Web of Science and Scopus only index papers that have been published in journals that they consider worthy of representation.
This difference is both a strength and a weakness of Google Scholar. On the positive side, Google Scholar indexes a much larger range of papers including a great deal of unpublished work, and so one can find items not listed in the other databases. On the negative side, Google Scholar indexes a much larger range of papers including a great deal of unpublished work and so papers returned by Google Scholar have even less by way of quality control.
Indeed, by indexing not only papers on preprint archives but also those on personal home pages, Google Scholar is known to index some pretty silly material. In our view, being indexed in the Web of Science or Scopus is a reasonable signal of legitimacy, whereas being indexed in Google Scholar tells us next to nothing about the legitimacy of a paper.
Similarly, the appearance of a paper is no longer a good cue that it is legitimate. Prior to 1978, professional-quality typesetting lay out of reach of the common man on the street or the common scientist in her lab. Professional publishers more or less had a monopoly on this aspect of scholarly communication. In 1978, all of this changed. Computer scientist Donald Knuth released TeX, a typesetting system designed to run on a personal computer. With TeX, authors could typeset their own text and equations as well as or better than any professional system. TeX was such a remarkable advance that almost forty years later, it remains the most common way of typesetting technical papers and its output quality also remains unmatched by commercial word processors. The last remaining barriers to the use of TeX (a small but non-zero amount of command-line savvy) are disappearing as increasingly good graphical user interfaces are designed and web-based TeX editors such as Overleaf and ShareLaTeX are becoming widely adopted. In 2017, any crackpot can easily write a paper that looks just like a legitimate piece of scholarship, at least so far as typesetting is concerned.
In principle, science is utterly egalitarian. It doesn't matter who has an idea, it only matters whether that idea offers a better representation of nature. In this sense, the identity of a paper's authors should not matter in the least. In practice, however, there is no reason why we should not approach a paper as a good Bayesian would: taking into account all prior information when making judgements. (While we would hope that one's assessment of the ideas within a paper would swamp one's prior, the prior can be particularly important when deciding what to read in the first place.)
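For readers who want to see what "approaching a paper as a Bayesian" means mechanically, here is a minimal sketch of Bayes' rule in Python. Every number in it is invented purely for illustration, and the function and parameter names are hypothetical; the point is only how a prior belief about legitimacy is updated by a piece of evidence.

```python
def posterior_legitimate(prior: float,
                         p_evidence_if_legit: float,
                         p_evidence_if_not: float) -> float:
    """Bayes' rule: P(legit | evidence) from a prior and two likelihoods."""
    numerator = prior * p_evidence_if_legit
    denominator = numerator + (1.0 - prior) * p_evidence_if_not
    if denominator == 0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


# Invented numbers: start undecided (prior 0.5) and suppose the evidence we
# observe is six times as likely for a legitimate paper (0.6) as for an
# illegitimate one (0.1).
updated = posterior_legitimate(0.5, 0.6, 0.1)
print(round(updated, 3))
```

If the evidence is equally likely under both hypotheses, the posterior equals the prior: uninformative cues should not move your judgement at all.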
This brings us to the issue of who the authors are. Much as it strains against our ideological impulses, we feel that when trying to assess the legitimacy of a paper the identity of the authors provides useful information of several types. These include:
Are the authors well-established? On one hand, we believe that the best ideas in science often come from graduate students and postdocs, and we believe that people pay too much attention to famous names. On the other hand, and as much as it pains us to say this, with all else equal a paper from a researcher with an extensive publication record and strong reputation is somewhat more credible than a paper from authors who have not published other scholarly work.
Part of this is a simple Bayesian calculation. People who have done good work in the past are likely to be doing good work now. But there's also an interesting incentive issue here. An example illustrates. It's hard for an amateur to evaluate the quality of a diamond at a glance. So think about where you'd feel safer buying a diamond: at Tiffany and Co., or from some guy with a reputation of +2 on eBay? Obviously Tiffany and Co. is a better bet, and a big part of this is because their massive brick-and-mortar presence means that they're in it for the long run. They have an enormous amount to lose from trying to pass along garbage. Not so the fellow on eBay, who can simply start a new account tomorrow with little loss of reputational capital. We feel that the same holds for scholars. A senior professor with 100 papers to her credit is doing something very different when she stakes her reputation on a bold new claim than is a researcher in the private sector with no prior publications.
The green coffee paper we have been discussing is a good illustration. The senior author who conducted the actual trial, Mysore V. Nagendran, does not have an appreciable publication record. Other than the paper and related conference presentations, he has published at most one other paper, a decade earlier (and we have not been able to verify that this was not published by a different researcher who happens to share the same name).
Are the authors experts in the specific area treated in the paper? While many good papers have been written by newcomers to a field, and while we are grateful to philosophers and physicists and so forth for taking our work in their areas seriously despite our lack of track record, we feel that all else equal a paper is more likely to be reliable when written by authors with substantial experience in the area.
Do the authors have a vested interest in the results they are publishing? Most researchers feel that papers are less credible when their authors have a direct financial stake in the results reported. Our green coffee paper again provides an example. It turns out that this study was funded by Applied Food Sciences Inc. (AFS), the company that manufactured the green coffee extract in question. They both paid for the original trials, and hired the two lead authors to rewrite the paper after the original manuscript could not be published. The close involvement of AFS in the trial is problematic. It does not stretch the imagination far to imagine that this could have had something to do with the senior author's extensive alterations of the data, or with the two lead authors' failure to rectify serious inconsistencies in the data that they received.
So how do you know whether the authors have a financial stake in the results? One way to tell is simply to look at whether the authors are affiliated with firms that would benefit from the results of a study. An experimental trial showing no deleterious health consequences of a pesticide is substantially less credible if authored by employees of the pesticide manufacturer. Another is to look at the funding section of a study. Some industry funders are able to put undue pressure on researchers to publish only those results that benefit the company.
Especially in biomedicine, journals now require detailed conflict-of-interest (COI) statements for each author, disclosing any such financial relationship, be it in the form of corporate research funding, a stake in a related company, consulting agreements, or other associations. Unfortunately, this was not helpful in the green coffee paper. The authors of that paper asserted that they had no conflicts of interest to declare. (Recall that this work was published in a low-quality journal; higher-tier journals probably do a better job of verifying COI statements.)
Science is not immune to bullshit. There's bullshit in legitimate papers written by good authors and published in top journals. That kind of bullshit can be tricky to catch, and much of our effort in this course goes toward providing you the skills with which to do so. In this piece, however, we're dealing with something different and easier to detect: papers that superficially look a bit like real science, but do not represent legitimate work. You'll come across this stuff occasionally through Google searches, and the popular press can be fooled on occasion. Learn to identify the signs that a paper is not legitimate science, and as you read, be aware of your scholarly surroundings at all times.