Want better science? Quit hoarding data, genetics researchers say


When Andrea Downing was 25, she got screened for the BRCA genes known to be associated with a variety of cancers, including breast cancer. Both her great-grandmother and grandmother had been diagnosed with the disease, so the results were no surprise: Downing carried the BRCA1 mutation in her genes. She learned there was an 87 percent chance she would get breast cancer during her lifetime, and a 60 percent chance she would get ovarian cancer.

The revelation brought with it a dizzying range of choices. Should she get a mastectomy before the cancer showed? Should she choose to have her ovaries removed? Could she wait until after she had kids?

For the first several years after her diagnosis, Downing sought out support groups, then began booking appointments with researchers and examining the latest literature. “I’m a little different from your usual patient who tested positive,” Downing said. “I wanted to go beyond to challenge myself and understand the science of cancer.”

Then, in 2013, she chanced on ClinVar, a research database funded by the National Institutes of Health that acts as a kind of Wikipedia to catalogue scientific research on mutations in genes. It gave her a roadmap for the research associated with her variant, called C16G.

Downing typed in the letters and numbers of her mutation, and the website spit out a list of companies and labs that have studied her variant. Though much of that information was technical, she said, “the things I do understand about it are very empowering. It’s a starting point to answering questions I don’t know.”

When the database first launched, the idea was that the single repository would present a unified picture of a variant, drawing from all available research that was publicly shared by companies and research labs.

Two years later, the team behind the operation has published a progress report of sorts in the New England Journal of Medicine. They argue that this shared approach is working — doctors and researchers are using the database — and they are advocating for more companies and groups to join the effort to reach a more comprehensive understanding of the variants in disease genes. In particular, they’re challenging companies to be more open with their data, instead of keeping it to themselves.

“Healthy competition among isolated entities is no longer sufficient to drive our understanding of human variation, and patient care may be compromised when data are not shared,” the authors write. They add that if doctors are to benefit from what’s known about human genetics, “large scale collaborative efforts” are the only path to follow.

If private companies or single labs follow their own interpretation of variants, they’re likely to get it wrong, said Heidi Rehm, a physician at Brigham and Women’s Hospital who was one of the early architects of the database.

The NEJM study found that at least 17 percent of variants in the database were interpreted differently across laboratories — things like how long it would take for a disease to appear, or how likely a patient was to develop a disease — which meant people with the same variant for the same disease were getting different information.

Rehm said that the highest volume of data was submitted by clinical laboratories. Because knowledge about a particular variant is constantly evolving, the next step in the ClinVar setup is to include a system to indicate how far along the information is.

Nine years after her screening, Downing has undergone a mastectomy, gotten married, and moved to Oregon. She is expecting her first child in the next two weeks. “My data is not an abstraction to me, it represents suffering in my family,” she said. But sharing it broadly and widely means that others in her position, including her children, can benefit.

Nidhi Subbaraman writes about science and research.