Wednesday, March 2, 2011

Big science: The cancer genome challenge

[Courtesy: NatureNews]
Databases could soon be flooded with genome sequences from 25,000 tumours. Heidi Ledford looks at the obstacles researchers face as they search for meaning in the data.
When it was first discovered, in 2006, in a study of 35 colorectal cancers1, the mutation in the gene IDH1 seemed to have little consequence. It appeared in only one of the tumours sampled, and later analyses of some 300 more have revealed no additional mutations in the gene. The mutation changed only one letter of IDH1, which encodes isocitrate dehydrogenase, a lowly housekeeping enzyme involved in metabolism. And there were plenty of other mutations to study in the 13,000 genes sequenced from each sample. "Nobody would have expected IDH1 to be important in cancer," says Victor Velculescu, a researcher at the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University in Baltimore, Maryland, who had contributed to the study.
But as efforts to sequence tumour DNA expanded, the IDH1 mutation surfaced again: in 12% of samples of a type of brain cancer called glioblastoma multiforme2, then in 8% of acute myeloid leukaemia samples3. Structural studies showed that the mutation changed the activity of isocitrate dehydrogenase, causing a cancer-promoting metabolite to accumulate in cells4. And at least one pharmaceutical company — Agios Pharmaceuticals in Cambridge, Massachusetts — is already hunting for a drug to stop the process.
Four years after the initial discovery, ask researchers in the field why cancer genome projects are worthwhile, and many will probably bring up the IDH1 mutation, the inconspicuous needle pulled from a veritable haystack of cancer-associated mutations thanks to high-powered genome sequencing. In the past two years, labs around the world have teamed up to sequence the DNA from thousands of tumours along with healthy cells from the same individuals. Roughly 75 cancer genomes have been sequenced to some extent and published; researchers expect to have several hundred completed sequences by the end of the year.

The efforts are certainly creating bigger haystacks. Comparing the gene sequence of any tumour to that of a normal cell reveals dozens of single-letter changes, or point mutations, along with repeated, deleted, swapped or inverted sequences (see 'Genomes at a glance'). "The difficulty," says Bert Vogelstein, a cancer researcher at the Ludwig Center for Cancer Genetics and Therapeutics at Johns Hopkins, "is going to be figuring out how to use the information to help people rather than to just catalogue lots and lots of mutations". No matter how similar they might look clinically, most tumours seem to differ genetically. This stymies efforts to distinguish the mutations that cause and accelerate cancers — the drivers — from the accidental by-products of a cancer's growth and thwarted DNA-repair mechanisms — the passengers. Researchers can look for mutations that pop up again and again, or they can identify key pathways that are mutated at different points. But the projects are providing more questions than answers. "Once you take the few obvious mutations at the top of the list, how do you make sense of the rest of them?" asks Will Parsons, a paediatric oncologist at Baylor College of Medicine in Houston, Texas. "How do you decide which are worthy of follow up and functional analysis? That's going to be the hard part."

Drivers wanted

Because cancer is a disease so intimately associated with genetic mutation, many thought it would be amenable to genomic exploration through initiatives based on the collaborative model of the Human Genome Project. The International Cancer Genome Consortium (ICGC), formed in 2008, is coordinating efforts to sequence 500 tumours from each of 50 cancers. Together, these projects will cost in the order of US$1 billion. Eleven countries have already signed on to cover more than 20 cancers (see map). The ICGC includes two older, large-scale projects: the Cancer Genome Project, at the Wellcome Trust Sanger Institute near Cambridge, UK, and the US National Institutes of Health's Cancer Genome Atlas (TCGA). The Cancer Genome Project has churned out more than 100 partial genomes and roughly 15 whole genomes in various stages of completion, and intends to tackle 2,000–3,000 more over the next 5–7 years. TCGA, meanwhile, wrapped up a three-year, three-cancer pilot project last year, then launched a full-scale endeavour to sequence up to 500 tumours from each of more than 20 cancers over the next five years.
Although the groups collaborate, TCGA has not yet been able to fully join the ICGC owing to differences in privacy regulations governing access to genome data. For now, members of both consortia are sequencing a subset of tumour samples from each cancer type — around 100 — and will follow this by sequencing promising areas in the remaining 400. That's useful, says Joe Gray, a cancer researcher at Lawrence Berkeley National Laboratory in California, but it's just a start. "In the early days, I thought that doing a few hundred tumours would probably be sufficient," he says. "Even at the level of 1,000 samples, I think we're probably not going to have the statistics we want."
What bigger numbers could provide is more driver mutations like the one in IDH1. These could, researchers argue, provide the clearest route to developing new cancer therapies. Many scientists have looked for mutations that occur repeatedly in a given type of tumour. "If there are lots and lots of abnormalities of a particular gene, the most likely explanation is often that those mutations have been selected for by the cancers and therefore they are cancer-causing," says Michael Stratton, who co-directs the Cancer Genome Project. This approach has worked well in some cancers. For example, with a frequency of 12%, it is clear that the IDH1 mutation is a driver in glioblastoma. Such searches should be fruitful for cancers that have fewer mutations overall. The full genome sequence of acute myeloid leukaemia cells yielded just ten mutations in protein-coding genes, eight of which had not previously been linked with cancer5.
Other cancers have proved more challenging. IDH1 was overlooked at first, on the basis of the colorectal cancer data alone. It was not until the search was expanded to other cancers that its importance was revealed. Moreover, some mutations shown to be drivers haven't turned up as often as expected. "It's very clear, now that all the genes have been sequenced in this many tumours, you have drivers that are mutated at very low frequency, in less than 1% of the cancers," says Vogelstein. To find these low-frequency drivers, researchers are sampling heavily — sequencing 500 samples per cancer should reveal mutations that are present in as few as 3% of the tumours. Although they may not contribute to the majority of tumours, they may still have important biological lessons, says Stratton. "We need to know about these to understand the overall genomic landscape of cancer."
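The sampling argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes that each tumour independently carries a given mutation with probability equal to its frequency, and that sequencing never misses a mutation that is present. Neither assumption comes from the consortia. The function names and the recurrence threshold `min_hits` are hypothetical.

```python
from math import comb


def expected_carriers(n_tumours: int, frequency: float) -> float:
    """Expected number of sampled tumours carrying the mutation."""
    return n_tumours * frequency


def prob_at_least(min_hits: int, n_tumours: int, frequency: float) -> float:
    """P(X >= min_hits) for X ~ Binomial(n_tumours, frequency).

    Assumes tumours are independent and detection is perfect,
    which is a simplification made for illustration.
    """
    # Sum the probabilities of seeing fewer than min_hits carriers,
    # then take the complement.
    below = sum(
        comb(n_tumours, i) * frequency**i * (1 - frequency) ** (n_tumours - i)
        for i in range(min_hits)
    )
    return 1.0 - below


if __name__ == "__main__":
    # min_hits=5 is an arbitrary, hypothetical recurrence threshold.
    for freq in (0.03, 0.01):
        print(freq, expected_carriers(500, freq), prob_at_least(5, 500, freq))
```

Under this simple model, 500 samples would be expected to contain about 15 tumours carrying a mutation found at 3% frequency, but only about 5 carrying one found at 1%, which is one way to see why low-frequency drivers are hard to separate from passengers.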
Another popular approach has been to look for mutations that cluster in a pathway, a group of genes that work together to carry out a specific process, even if the mutations strike it at different points. In an analysis of 24 pancreatic cancers6, for instance, Vogelstein and his colleagues identified 12 signalling pathways that had been altered. Nevertheless, Vogelstein cautions that this approach is not easy to pursue. Many pathways overlap, and their boundaries are unclear. And because many have been defined using data from different animals or cell types, they do not always match what's found in a specific human tissue. "When you layer on top of that the fact that the cancer cell is not wired the same as a normal cell, that raises even further difficulties," says Vogelstein.

How much is enough?

Separating drivers from passengers will become even more difficult as researchers move towards sequencing entire tumour genomes. To date, only a fraction of the existing cancer genomes are complete sequences. To keep costs low, most have covered only the exome, the 1.5% of the genome that directly codes for protein and is therefore the easiest to interpret. Assigning importance to a mutation found in the murky non-protein-coding depths of the genome will be more challenging, especially given that scientists don't yet know what function — if any — most of these regions usually serve. The vast majority of mutations fall here. The full genome sequence of a lung cancer cell line, for example, yielded 22,910 point mutations, only 134 of which were in protein-coding regions (see graphic, right)7. Nevertheless, finding them is worth the cost and effort, argues Stratton. "It could be that none of those mutations pertain to the causation of cancer," he says. "But it equally could be that some do. We'll never find out unless we systematically investigate."
Not everyone agrees. Some researchers argue that the costs of cancer-genome projects currently outweigh the benefits. Prices are poised to drop dramatically in the next few years as a new generation of sequencing machines comes online, says Ari Melnick, a cancer researcher at Weill Cornell Medical College in New York. "Why not wait for that?" he asks. In the meantime there are lower-hanging fruit to pick, says Stephen Elledge, a geneticist at Harvard Medical School in Boston, Massachusetts. Mutations that affect how many copies of a gene are found in a genome, he argues, are cheaper to assess and provide a more intuitive insight into biological processes. "If you delete something, you can turn a pathway off very efficiently," he says. "And if you amplify something, you can increase flow through the pathway. Making point mutations in genes to activate them is a little dicier."
Changes in gene copy number can be detected using fast, relatively inexpensive array-based technologies, but sequencing can provide a higher-resolution snapshot of these regions, says Elaine Mardis, a sequencing specialist at Washington University in St Louis, Missouri. Sequencing can enable researchers to map the boundaries of insertions and duplications with more precision and to catch tiny duplications or deletions that might have gone undetected by an array. Mardis, along with her colleague Richard Wilson and others, used sequencing to detect overlapping deletions in a breast cancer that had spread to other parts of the body (see page 999)8. The deletions spanned the region containing CTNNA1, a gene thought to suppress the spread, or metastasis, of cancer.
Meanwhile, cancer genomics is spreading out from under the large, centralized projects. For example, a $65-million, three-year paediatric-cancer genome project headed by researchers at St Jude Children's Research Hospital in Memphis, Tennessee, and Washington University aims to sequence 600 tumours. And more small projects seem poised to pop up. "Pretty much any cancer centre with any interest in the genomics of cancer is now buying these sequencers and using them," says Sam Aparicio, a cancer researcher at the University of British Columbia in Vancouver, Canada.

Part of the reason that cancer-genome proponents don't want to wait for sequencing costs to drop is that the real work starts after the sequencing is over. As Velculescu puts it, "Ultimately it's going to take good old-fashioned biology and experimental analyses to really determine what these mutations are doing." With this in mind, the US National Cancer Institute established two 2-year projects in September last year to develop high-throughput methods to test how the mutations identified by the TCGA pilot project affect cell function. The two centres — one at the Dana-Farber Cancer Center in Boston, and another at Cold Spring Harbor Laboratory in New York — aim to systematize the way that researchers pull other needles like the IDH1 mutation from the cancer-genomes haystack and make sense of them. The Boston team will systematically amplify and reduce the expression of genes of interest in cell cultures, and the Cold Spring Harbor centre will study cancer-associated mutations using tumours transplanted into mice.
In addition, large-scale projects are being run in parallel with the cancer-sequencing consortia to assess the effects of deleting each gene in the mouse genome, enabling researchers to learn more about the normal function of genes that are mutated in cancer. Sequencing is all very well, researchers have realized, but it won't be enough. "Some people say statistics should get us all the drivers that are worthwhile," says Lynda Chin, an investigator with TCGA at Harvard Medical School. "I don't agree with that. At the end of the day, we need these functional studies to prioritize the list of potential cancer-relevant candidates." 
See also News and Views, page 989.

Studies spot a gene that allows some cancer cells to evade drugs such as Taxol.

[Image caption: Ovarian cancer cells can survive chemotherapy drugs if they have defects in the FBW7 gene. Credit: Steve Gschmeissner/Science Photo Library]
Potent chemotherapy drugs such as Taxol (paclitaxel) prompt cancer cells to self-destruct — but some tumours stubbornly survive the treatment.
Two studies have now independently pinpointed a gene that lies behind at least part of this resistance1,2. The discovery could help oncologists predict which patients are likely to respond to Taxol and drugs with similar actions, and which may not. It also flags up new targets for cancer therapy.
Taxol belongs to a class of chemotherapy drugs that work by binding to tubulin, a key protein in the network of filaments that maintains a cell's structure. Cells hit with anti-tubulin drugs "try to divide but they can't", says Ingrid Wertz, a molecular biologist at biotechnology company Genentech, headquartered in San Francisco, California.
Cancer cells that respond to Taxol eventually die. But other cells resist the treatment. In 2007, Wertz and her colleagues began asking why. The team found that, in responding cells, levels of one protein in particular, MCL1 – part of a family of proteins already known to affect a cell's life-cycle – were markedly lower immediately after treatment with Taxol (and with another anti-tubulin drug, vincristine). Further experiments showed that a known cancer-fighting protein, FBW7, was destroying MCL1.
Defects in the FBW7 gene had already been linked to a variety of cancers, including breast and colon. Wertz reasoned that its absence might in particular lead to high levels of MCL1, and explain why some cancer cells don't die when treated with anti-tubulin drugs.
Sure enough, the researchers observed that ovary and colon tumour cells with mutations in FBW7 had higher levels of the MCL1 protein and were more resistant to anti-tubulin drugs than cells with working copies of the gene.

Different route, same outcome

Meanwhile, Wenyi Wei, a molecular biologist at Beth Israel Deaconess Medical Center in Boston, Massachusetts, and his colleagues were also studying FBW7's effects. Wei's group was focusing on a particular disease: T-cell acute lymphoblastic leukaemia (T-ALL), in which an estimated 30% of all cases have cells with FBW7 defects. These cells had high levels of other proteins that normally induce cell death, and yet they did not die. Wei's research into why that was led him to the same explanation: without the FBW7 protein, the cells did not break down MCL1, a necessary step for their death. "We went about it in opposite ways but ended up at the same conclusion," Wertz says, "which was really cool."
Wei and his colleagues also found a link to drug resistance. They exposed T-ALL cells to ABT-737, an experimental drug discovered by US healthcare firm Abbott, based in Abbott Park, Illinois (a newer version, ABT-263, is now in phase II clinical trials). This drug does not attack tubulin, but kills by blocking other proteins that promote cell survival. Again, cells with an FBW7 defect, and high levels of MCL1, were less sensitive to the drug. But the researchers found a way to solve this problem: by treating the cells with an agent called sorafenib, which lowered MCL1 levels and restored cells' sensitivity to the experimental drug.
The studies suggest that oncologists may be able to tailor their treatments based on whether or not patients have a defective FBW7 gene in tumour cells. "I think it has potential implications for any cancer in which these anti-microtubule agents are used," Wertz says.
Still, there are other ways to resist Taxol and similar drugs. Cancer cells may contain mutated tubulin, meaning anti-tubulin drugs can't bind to them in the first place. Or they may contain extra protein pumps that enable cells to quickly eliminate chemotherapy drugs. Anthony Letai, an oncologist at the Dana-Farber Cancer Institute in Boston, says that the importance of the MCL1 pathway in conferring drug resistance probably varies depending on the type of cancer.
"As with any study, you don't know how generalizable it is beyond these cell lines that they study," says Letai. "There's probably plenty of cell lines in which these effects are not observable." The trick, he adds, will be to figure out which cancers follow this model.

Bruce Clurman, an oncologist and molecular biologist at the Fred Hutchinson Cancer Research Center in Seattle, Washington, says the findings are exciting and provocative, but preliminary. He notes that FBW7 targets a number of proteins for destruction, not just MCL1. "When you disrupt FBW7, it's hard to know which of these downstream targets are playing what role in the development of cancer." These studies focus on FBW7's role in regulating MCL1, but "it's certainly far from the whole story", he says.
Hayley McDaid, a cancer biologist at Albert Einstein College of Medicine in New York, suggests looking at archived specimens from cancer patients treated with Taxol-like drugs. If Wertz's model holds, the researchers should find a correlation between the presence of a working FBW7 gene and response to Taxol. "We need to go in and actually do some sequence analysis on those specimens," she says.