On 6 May, the US stock market experienced a peculiar 'minicrash' when what seems to have been a mishandled trading order temporarily sent stocks plummeting. The dramatic episode on Wall Street underscores how small errors can substantially upset data-heavy systems, and how deciphering the error afterward can be a seemingly impossible task.

The same holds true in increasingly data-intensive biotechnology research, as concerns about data errors made by researchers at Duke University in Durham, North Carolina, make clear.

Four years ago, researchers from Duke published what was hailed as a ground-breaking paper in Nature Medicine (Nat. Med. 12, 1294–1300, 2006). Using high-throughput microarray technology, they had examined how tens of thousands of genes might affect a patient's reaction to various combinations of chemotherapy drugs.

The study meant that cancer patients could begin to be prescribed chemotherapy regimes that would work best for their genetic predisposition, a major step forward for personalized medicine. The finding was so promising that a team from the Houston-based M.D. Anderson Cancer Center, specializing in what they call 'forensic bioinformatics', began to try to recreate the Duke team's data in hopes of doing similar work at their institution, says Keith Baggerly, who conducted the reexamination work with fellow researcher Kevin Coombes (Nat. Med. 13, 1276–1277, 2007). To date, the team claims that they've spent more than 15,000 hours of work on the project.

The M.D. Anderson team found inconsistencies in the paper's findings, which were later determined to be the result of data-handling mistakes on the part of the Duke team. For example, in one instance, a label column in an Excel file was accidentally shifted, resulting in the entire set of data being mislabeled by one position. The Duke team openly admitted to making errors, including the 'one-off' mistake and other mislabeling errors, but said that these glitches did not affect the findings. The Nature Medicine paper was corrected online.
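To see why a one-position shift is so damaging, consider the minimal Python sketch below. The labels are hypothetical and are not taken from the Duke data, and the shift shown (every label moved down one row, leaving the first row blank) is only an assumption about how such a spreadsheet slip might look.

```python
# Hypothetical drug-response labels, one per sample; not data from the paper.
labels = ["sensitive", "resistant", "sensitive", "resistant", "resistant"]


def shift_down(column, offset=1):
    """Simulate a column that slipped down by `offset` rows.

    The first `offset` rows become blank (None) and the last `offset`
    labels fall off the end, so the column keeps its original length.
    """
    if offset <= 0:
        return list(column)
    return [None] * offset + list(column[:-offset])


def mismatched_rows(expected, observed):
    """Return the row indices where the observed label differs from the expected one."""
    return [i for i, (a, b) in enumerate(zip(expected, observed)) if a != b]


shifted = shift_down(labels)
bad_rows = mismatched_rows(labels, shifted)

print(shifted)
print(f"{len(bad_rows)} of {len(labels)} samples carry the wrong label: rows {bad_rows}")
```

In this toy case, four of the five samples end up with the wrong label, and the only row that still looks correct does so by coincidence, because two neighbouring samples happened to share a label. A check like `mismatched_rows` is possible only when an independent copy of the original labels exists, which is part of why such errors are hard to reconstruct after the fact.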

In December, Baggerly and Coombes published their analysis of the Nature Medicine paper and others by the Duke team. Baggerly and Coombes alleged that patients enrolled in clinical trials based on the Duke work might actually be receiving treatments that, according to their reanalysis, would be less effective for their genetic predisposition than indicated (Ann. Appl. Stat. 3, 1309–1335, 2009).

Duke administrators quickly suspended three of these clinical trials and invited an outside group to examine their team's work. In January of this year, the trials resumed, accompanied by a note from the administrators stating that the independent review had found no indication that the data errors had affected the overall findings or would put patients at risk.

However, Duke administrators refused to release the review. It was not until the publication The Cancer Letter, working with Baggerly, filed a Freedom of Information Act request to the National Cancer Institute that the document became public in mid-May—but much of the data is redacted.

The Duke researchers, Anil Potti and Joseph Nevins, say that the redaction is necessary to prevent the release of unpublished data and that the review's findings are clearly spelled out.

“We made mistakes—the types of mistakes that, unfortunately, happen in research,” Potti says. “We admit that, and we worked hard to make sure that they were corrected and did not affect our findings. We're doing everything within reason to make that clear. To make some sort of leap to the accusation that patients are being put at risk is simply not fair—most of all, to the patients who would be denied better treatment.”

However, Baggerly says that the redacted version does not reveal enough information to calm concerns that lingering data problems are potentially putting patients at risk.

“This idea that they have to be secretive to protect their work isn't going to benefit them,” he says. “They have to break with this idea and show everything—which, admittedly, is hard, since that's not how things are usually done.”

At about the same time that the redacted report was released, officials at the National Cancer Institute Cancer Therapy Evaluation Program removed from a phase 3 clinical trial a biomarker test based on different work from the Duke team. In a statement to The Cancer Letter, the program's director said that they were unable to confirm the test's “utility.”