Tag Archives: Research

Conflicts of Interest at Bern’s Inselspital?

In the NZZ, Daniel Gerny and Simon Hehli report on potential conflicts of interest at Bern’s university hospital, the Inselspital.

A cardiologist found evidence of negative health effects of in vitro fertilization, which the Inselspital offers. Unusually, however, the hospital’s PR department did not publicize the findings. According to the PR department, the reason for keeping quiet was that the findings also affect another department of the hospital group, and that this had to be taken into account:

The Bern study was published in the renowned «Journal of the American College of Cardiology». For days, the findings have been on everyone’s lips. … All the more astonishing that Scherrer’s employer of all places, the Inselspital, pays the study little attention: it is not actively informing the public about it, and no press release is planned. Yet the study’s authors were well aware of the response their work would trigger. They therefore alerted the hospital’s communications department to their work early and deliberately. But the department promptly waved them off: in this case it did not want to engage in active media work, it stated in an e-mail that the NZZ has seen. As an explanation for this decision, the Inselspital told the researchers, among other things, “that your results directly affect another department of the Insel Group. Since we handle communications for the entire group, we must take this into account as well.”

On Publishing and Cost Benefit Analysis

On his blog, Gilles Saint-Paul comments on the publication process in economics.

Of course I was wrong in all accounts. The publication process in economics is not a publication process, it is a validation process by which we acquire a certain rank in a certain pecking order. Submitting a paper to a journal has nothing to do with research dissemination, it is far more similar to taking an exam or participating in a sports competition. The actual dissemination takes place mostly orally, in seminars and conferences; these seminars and conferences are also important validation events, because they allow authors to signal some of their characteristics that may influence their position in the pecking order, while not being easy to infer from their papers.

Now, when you take an exam as a student, you are graded by your professor, not by a fellow student – who would be a competitor if this exam is actually a contest. …

Yet this is the way our own profession is organized. Each submission is “peer reviewed”, that is, it has to be accepted by anonymous referees who happen to be participating in the same beauty contest as the author(s), most often in the same subcategory. At a minimum, as believers of cost-benefit analysis, we should consider that the journal editors and referees themselves perform a cost-benefit analysis when deciding whether or not to publish a paper. I must say that if I apply such a theory to explain my own experience with acceptances and rejections, I easily get an R² of 80%.

BIS Research Review

In an Independent Review of BIS Research, Franklin Allen, Charles Bean and José De Gregorio conclude that

… BIS research clearly ‘punches above its weight’ compared to its central bank peers. Finally, the relative performance of the BIS has clearly improved over the past five years, a tribute to the influence of the previous (Steve Cecchetti) and current (Claudio Borio and Hyun Shin) leadership …

They recommend, among other points:

The research programme should have a more clearly defined long-term focus, be less driven by short-term needs, and seek to be more holistic in approach.

The internal culture should be more open to challenge and research should avoid focussing on generating results to support the ‘house view’.

Doubts about Empirical Research

The Economist reports on research by Paul Smaldino and Richard McElreath indicating that studies in psychology, neuroscience and medicine have low statistical power (the probability of correctly rejecting a null hypothesis that is false). If almost all published studies nevertheless report significant results (i.e., rejections of null hypotheses), this is suspicious.
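Why this is suspicious can be illustrated with a small back-of-the-envelope calculation. The sketch below is not taken from Smaldino and McElreath; the function name, the 5% significance level, the power of 0.4 and the shares of true hypotheses are all assumptions chosen for illustration. It computes the fraction of studies that should come out significant if every study were published.

```python
def expected_significant_share(power: float, share_true: float, alpha: float = 0.05) -> float:
    """Expected fraction of studies reporting a significant result.

    power      -- probability of detecting an effect that really exists
    share_true -- fraction of tested hypotheses that are actually true (assumed)
    alpha      -- significance level, i.e. the type I error rate (assumed 5%)
    """
    true_effects_detected = share_true * power
    false_alarms = (1.0 - share_true) * alpha
    return true_effects_detected + false_alarms


# Illustrative values only: a power of 0.4 and three different shares of true hypotheses.
for share_true in (0.1, 0.5, 1.0):
    share = expected_significant_share(power=0.4, share_true=share_true)
    print(f"share of true hypotheses {share_true:.0%}: {share:.1%} of studies significant")
```

As long as the power exceeds the significance level, the expected share of significant results cannot be larger than the power itself, even if every tested hypothesis is true. A literature in which nearly all studies are significant is therefore hard to reconcile with low power unless results are selectively published.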

Furthermore, Smaldino and McElreath’s research suggests that

the process of replication, by which published results are tested anew, is incapable of correcting the situation no matter how rigorously it is pursued.

Using a model of competing research labs, Smaldino and McElreath simulate how empirical scientific research progresses. Labs that find more new results also tend to produce more false positives. More careful labs try to rule out false positives but publish less. More “successful” labs are more likely to reproduce, passing their methods on to new labs. As a consequence, less careful labs proliferate. Replication (the repetition of randomly selected findings) does not stop this process.

poor methods still won—albeit more slowly. This was true in even the most punitive version of the model, in which labs received a penalty 100 times the value of the original “pay-off” for a result that failed to replicate, and replication rates were high (half of all results were subject to replication efforts).

Smaldino and McElreath conclude that “top-performing laboratories will always be those who are able to cut corners”—even in a world with frequent replication. The Economist concludes that

[u]ltimately, therefore, the way to end the proliferation of bad science is not to nag people to behave better, or even to encourage replication, but for universities and funding agencies to stop rewarding researchers who publish copiously over those who publish fewer, but perhaps higher-quality papers.

Research Productivity of Economics PhDs

In an article in the Journal of Economic Perspectives (data appendix), John Conley and Sina Önder argue that

only the top 10–20 percent of a typical graduating class of economics PhD students are likely to accumulate a research record that might lead to tenure at a medium-level research university. … graduating from a top department is neither necessary nor sufficient for becoming a successful research economist. Top researchers come from across the ranks of PhD-granting institutions, and lower-ranked departments produce stars with some regularity, although with lower frequency than the higher-ranked departments. Most of the graduates of even the very highest-ranked departments produce little, if any, published research.

The Economist discussed the article here.

Self-Correcting Research?

The Economist doubts that science is self-correcting as “many more dodgy results are published than are subsequently corrected or withdrawn.”

Referees do a bad job. Publishing pressure leads researchers to publish their (correct and incorrect) results multiple times. Replication studies are hard and thankless. And everyone seems to be getting the statistics wrong.

A researcher commits a type I error when she rejects a null hypothesis although it is true (false positive), and a type II error when she fails to reject a null hypothesis although it is false (false negative). A good testing procedure minimises the probability of a type II error for a specified probability of a type I error; that is, it maximises the power of the test. While employing a test with a power of 80% is considered good practice, actual hypothesis testing often suffers from much lower power. As a consequence, many or even a majority of the apparent “results” identified by a test might be wrong, while most of the “non-results” are correctly identified. Quoting from the article:

… consider 1,000 hypotheses being tested of which just 100 are true (see chart). Studies with a power of 0.8 will find 80 of them, missing 20 because of false negatives. Of the 900 hypotheses that are wrong, 5%—that is, 45 of them—will look right because of type I errors. Add the false positives to the 80 true positives and you have 125 positive results, fully a third of which are specious. If you dropped the statistical power from 0.8 to 0.4, which would seem realistic for many fields, you would still have 45 false positives but only 40 true positives. More than half your positive results would be wrong.
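The arithmetic in the quote can be reproduced in a few lines. This is a minimal sketch; the function and variable names are my own, and the numbers (1,000 hypotheses, 100 of them true, a 5% type I error rate, power of 0.8 or 0.4) are those given in the quoted passage.

```python
def positive_results(n_hypotheses: int, n_true: int, power: float, alpha: float = 0.05):
    """Return (true positives, false positives, share of positives that are false)."""
    true_positives = n_true * power                      # true hypotheses that are detected
    false_positives = (n_hypotheses - n_true) * alpha    # wrong hypotheses that look right
    false_share = false_positives / (true_positives + false_positives)
    return true_positives, false_positives, false_share


# The two scenarios from the quoted passage.
for power in (0.8, 0.4):
    tp, fp, share = positive_results(n_hypotheses=1000, n_true=100, power=power)
    print(f"power {power}: {tp:.0f} true positives, {fp:.0f} false positives, "
          f"{share:.1%} of positive results are wrong")
```

With a power of 0.8, 45 of 125 positive results (36%) are false; with a power of 0.4, it is 45 of 85 (about 53%), which is the "more than half" mentioned in the quote.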