long read sequencing


Cardiologists Love Genomics: Euan Ashley, Stanford

Euan Ashley is one of the big names in genomic medicine that has been missing from our guest list. We’re happy to correct that today.

In 2010, he led the team who did the first clinical interpretation of a human genome--that of his Stanford colleague, Steve Quake. Since then Euan, an MD PhD, has been driving to make the use of new genomic tools and discoveries a routine part of medicine at Stanford, particularly in his own discipline of cardiology.

A regular speaker on the conference circuit, Euan titles his talks, “Genomic Medicine Is Here.”

“There were these one-off examples of great stories that captured everyone’s imagination,” he says at the outset, “but somewhere in there, what happened is it just became routine. And we started sending exome and genome sequences on patients and using that information to help find a cause, and in some cases, treatment for their condition. We were all waiting for it to happen, but it just happened under our noses.”

At the same time, Euan acknowledges that he “loses sleep at night” over “dark corners of the genome.” What are these dark corners? What recent findings were made by new long read sequencing? How has genomics impacted cardiology?

We begin with the question, if genomic medicine is here, why are there still so many skeptics?

Join us in our first interview with one of the few jazz saxophonists in our field, someone who knew he wanted to be a doctor at age four but wasn’t inspired by science--that is, until a high school teacher handed him a copy of Richard Dawkins' “The Selfish Gene” after class.

People Told Us It Was Impossible: UCSC’s Mark Akeson on Nanopore Sequencing

Mark Akeson has been working on nanopore sequencing at UC Santa Cruz’s biophysics lab for twenty years. Up until the past few years with the launch of Oxford Nanopore’s sequencers, that work was mostly the methodical toil of the quiet inventor.

Today it is quite ordinary to see a sequencer the size of your wallet being taken out into the field for DNA work. But for years, the naysayers dominated.

“Back in the day, the skeptics outnumbered the proponents 99 to 1,” Mark says in today’s show.

In his beginning-of-the-year blog, NIH Director Francis Collins called nanopore sequencing one of the four breakthroughs of 2016. And the NIH deserves some credit. Mark says they were constant in their funding and belief in the technology.

With the success of nanopore sequencing technology have come legal battles to secure the IP. Both Illumina and PacBio have sued Oxford Nanopore; the Illumina suit is now settled. And at the end of last month, Akeson’s lab (meaning the University of California) sued Genia, claiming ownership of the patents for Genia’s technology. Genia was founded in 2009, and we have interviewed them several times since 2011.

“There's the old adage about once something succeeds, there’s all sorts of people who claim to have invented it,” says Mark.  

So what’s next for Mark? Is he on board the “long read train”? How much more can sequencing improve?

 

When Long Reads are Double the Price of Short Reads, Short Reads Are Dead, Says Evan Eichler

Each year at this time, sequencing tools leader, Illumina, generates another round of sequencing buzz in the industry, this year by announcing the $100 genome is around the corner with their latest boxes. But more and more, people are asking just what they will get with that $100. Indeed, what do they get today with a $1,500 genome?

Illumina sells short read sequencing technology which is unable to characterize much of the human genome, particularly complex regions which are responsible for many of the known and unknown diseases.

Today’s guest has made his career studying structural variation of the genome. He’s done it with the rapidly improving long read sequencing technology, mostly on instruments produced by Pacific Biosciences. He says researchers have been seduced by the ability to sequence thousands and tens of thousands of genomes as opposed to understanding five or ten genomes really well.

Evan Eichler is a professor of genomics at the University of Washington and first made his name known back with the original Human Genome Project. In the final days of the project, he was brought into the NIH to analyze the genome for structural variation repeats. Neither the private Venter enterprise nor the public attempt had the ability to see them at the time, and with what Evan calls his “young, stupid naivety,” he waded into the project. He was able to compare data from the two groups without getting too caught up in the politics and ended up making an important contribution to the final output. Today Evan has established himself so well in the structural variation space that it is said no project into structural variation can be conceived without him.

“Work that we have done over the past couple years has shown that if you apply a new sequencing technology like long reads, you basically uncover 90% of the structural variation that is missed by short read sequencing technology.”

That’s a big number.

“That is a big number,” says Evan, “so the question is, how important are structural variations? That’s open to debate.”

Evan says there is data which shows that structural variant level changes are likely to be more impactful than those of single nucleotide variants (SNVs). He compares SNVs to little tremors and structural changes to earthquakes when it comes to regulating the genome.

As with his mentor, Jim Lupski, (featured on the program here), Evan is adamant that we must stop using short read technology and aligning to a reference genome. Rather, he says, we must get to the place where we are doing de novo assembly of each genome. We can do that in the research setting now, but we must do that clinically as well.

“If we’re still aligning sequences to a reference genome, and that’s our only way for understanding genetic variation ten years from now, clinically we’ve failed. What we need to think about is how to do this right, and that means understanding all the variation from stem to stern in these genomes.”

Why Diversity Is the Only Path Forward: Sarah Tishkoff on African Genomics

Are you lactose tolerant? If you’re of Northern European ancestry, this is because of a stretch of DNA in a gene enhancer that developed some 9,000 years ago. That’s the same time Northern Europeans began domesticating cattle for milk. If you’re of African ancestry, you may have one of three mutations which appeared independently of the European mutation--and of each other--about 6,000 years ago, again when dairying began.

The genetics around lactose tolerance are a great example of how diverse human populations evolved and how this diversity impacts our health. While many in our field are feeling chagrin at not being able to unlock more secrets in our biology that will lead to medical breakthroughs, some leading researchers are pointing to the need for more diversity in our genomic databases, with a particular emphasis on structural variation.

Sarah Tishkoff began studying African genetics back in graduate school on some cell lines that had been collected and started years before. It was at a conference in Cape Town, South Africa with other geneticists and archeologists and members of the local population where she was asked a question that began her career in Africa. Why are the populations up in Tanzania--those people who speak with clicks--so different from the people not far to the south? Sarah went to Tanzania to find out.

“I had no idea what I was doing at the time. I went to Tanzania and just did it. It was quite an experience going as a woman and leading a team of Africans who were just not used to working with a female leader.”

Since then, the dramatic improvement of sequencing technology has allowed Sarah and her team to do some groundbreaking genetic work, much of which has medical implications. For example, her research into the G6PD gene has shown that for certain African populations common malaria drugs can be toxic.

Because Africans are more genetically diverse and have the oldest genetic lineage, African genetics plays an important part in all human genetics research. It's important that our databases include this diversity. Sarah says the recent work to improve the human reference genome is “a great start” but there’s much more to be done. The three African genomes we pointed out in a recent program, she says, are actually from a common regional ancestor. They only reflect a fraction of the African diversity.

Sarah agrees with those in our field lately who have observed that there are still many mysteries in the genome which have not been unlocked because we’re missing important structural variants.

“I believe that some of these structural variants are going to be functionally super important. They’re going to impact normal variation and disease risk. If we had a more diverse set of reference genomes, then that would be great. People could then go ahead and use short read sequencing and map it back to all these diverse reference genomes. And that’s going to help people in terms of personalized medicine.”

Reference Genome Making Major Strides in Ethnic Diversity, Says Valerie Schneider, NCBI

A couple months back, we reported on a study showing that genetic tests for an inherited heart disorder were more likely to come back with false positive results for black Americans than for whites. The study provoked many in our industry to urge scientists to incorporate more ethnic diversity in their studies. So far, biology has been too Eurocentric—the databases are implicitly racist, they argue.

Perhaps no dataset for human genomics is referenced more than the human reference genome, or the GRCh38. This is the “Rosetta Stone” of genomics used by scientists and clinicians everywhere who are assembling and studying genomes. Valerie Schneider is a scientist at the NCBI who works every day on the GRCh38. She says major strides--enabled in part by better sequencing technologies--have been made lately to add diversity to the GRCh38 and to create other reference genomes for various populations around the globe.

The populations represented with these new projects include a Han genome, a Puerto Rican, a Yoruban, a Colombian, a Gambian, a Luhya, a Vietnamese, and one or two more Europeans.

“The sequence from these genomes is planned for correcting errors and adding new ‘alt loci’ to the reference genome. But these new assemblies are also intended to stand on their own as complements to the reference,” says Valerie.

Valerie reminds us that it’s still early days in genomics. There’s so much diversity in the human population that her team is not sure whether having a single reference for each of these ethnic groups will be sufficient.

With more reference genomes comes the challenge of how best to compare and visualize them. There is a major need for tools that can show large nests of sequence as opposed to a linear reference, she says in today’s interview.

What is Valerie's take on the term “reference quality genomes”, and how will a better reference genome improve precision medicine?

March 2016 with Nathan and Laura: Genomic Jenga and the Creator, the Anti-Abortion Lobby and Genetic Testing, and Theranos, Again

Which company offers the gold standard of sequencing? Nathan starts us out with a metaphor to compare linked reads with real long reads. Then it’s on to this month’s “knockout paper” that moves us yet further from a deterministic view of genetics. Or is this genomic Jenga part of the “proper design of the Creator”? Laura links a new Indiana law banning abortion due to chromosomal abnormalities such as Down Syndrome to a larger effort by the anti-abortion lobby to go after all genetic testing. Theranos plays the Donald Trump of our industry.

It’s the end of March and time to look back with Nathan Pearson of the New York Genome Center and Laura Hercher of Sarah Lawrence College.

How Good are Linked Reads? Serge Saxonov, 10X Genomics

When 10X Genomics launched their GemCode sequencing instrument at last year’s AGBT conference, what they offered seemed too good to be true. 10X was promising researchers a machine that could generate long reads using Illumina’s short read technology at a price lower than what PacBio could offer with their “real” long read instruments. A year earlier, Illumina had announced they were buying Moleculo, a company that promised to offer long read data out of the short reads. But good data with the Moleculo platform failed to materialize.

10X Genomics hasn’t had Moleculo’s problem, and was in fact declared the “winner” at AGBT this year when they presented de novo human data.

Today, for the first time, the CEO of 10X, Serge Saxonov, joins us to talk about their technology and the company’s stellar rise.

The question everyone wants answered from Serge is how well the 10X linked reads stand up to so-called “real” long reads. PacBio has spent years co-discovering with their customers applications where their long reads provide significant advantage over short reads, at a price. And even though PacBio released a cheaper-faster-better machine, the Sequel, late last year, some researchers have been wondering whether 10X might come through and “clean house” with their inexpensive system.

“Now you can get the information that people were hoping to access in maybe five or ten years--you can get it now. And in fact you don’t need to make a tremendous new investment and change your workflow radically,” says Serge.

While 10X is enabling Illumina customers to generate long reads, are there still limitations of the short read machines that can’t be overcome?

Serge and 10X have already launched a second system, the Chromium, which offers single cell analysis. How big is the single cell market, and what are Serge’s thoughts on the future of sequencing?

A Home Run on the First Hit: PacBio’s Jonas Korlach

Jonas Korlach is a natural storyteller—a rare trait in a scientist who is more comfortable presenting data than talking of himself. Jonas is the co-inventor of PacBio’s SMRT (single molecule, real time) sequencing, and we wanted to hear from him directly how it all got started, and also when the team realized that they had something big with long reads and close to 100X coverage. How many of us can boast of hitting it out of the park on our first try?

Frontiers of Sequencing: Putting Long Reads and Graph Assemblies to Work

OK, so we get it. Long read sequencing technology is cool. But how cool? Is it another great player on the field, or does it change the game altogether? 

The Mike Schatz lab at Cold Spring Harbor is well known for de novo genome assemblies and their work on structural variation in cancer genomes, so we were curious to hear how long reads have impacted their work. In today’s show, lab leader Mike Schatz and doctoral student Maria Nattestad tell of two new projects: the de novo assembly of a very difficult but important flatworm genome and better variant calls for oncogenes such as HER2.

In the case of the flatworm, Mike says that the move to using PacBio’s long reads improved the assembly by more than 100 times. That means the difference between looking at a super high resolution picture and a fuzzy, blurry one, he says. With her work on cancer cell lines, Maria is seeing variants that just weren’t there with short reads. Will her work translate to lower false positive rates for HER2 in clinical diagnostics?

What will be the major headline for sequencing and informatics in 2016?

Mike says we’ll see many more reference genomes done, and that the term “reference genome” itself is changing as we go from the one standard reference genome to multiple reference genomes representing the broader population. These new reference genomes are pushing bioinformaticians to come up with new ways to visualize and compare the genomes. Maria details her work on using “graph” assemblies as opposed to the linear approach made popular by the Genome Browser. She says that already a new generation of informaticians is rethinking genome visualization using graph assemblies. (Included below is an image representing her work.)

Neither mentioned it, so we ask at the end, what about Oxford Nanopore’s tech?

 

(The spectral karyotype of the HER2-amplified breast cancer cell line SK-BR-3. The original chromosomes are different colors, so this genome is a complex mixture of various chromosomes. The total number of chromosomes has also jumped from 46 to 80, and there is approximately twice as much DNA as in a normal human genome. Maria Nattestad and Mike Schatz are studying this genome to see how oncogenes like HER2 became amplified while all these changes took place in the big picture of the genome.)

Knowing More about What We Don’t Know: John McPherson on Cancer Genomics

More than with any other major disease, the understanding and treatment of cancer is being transformed by genomics. And these are early days.

John McPherson has been involved in sequencing since the original Human Genome Project. He now directs the Genome Technologies Program at the Ontario Institute for Cancer Research. John chaired a panel on cancer genomics at the recent AGBT, or Advances in Genome Biology and Technology conference, and shares his thoughts on this year's meeting.

Like many others, John is excited about the new possibilities gained by long read sequencing, particularly in showing structural variations of various cancers.

We ask John which platforms he likes, and most importantly--in this day of increasing sequencing instrument options--how he decides how much to spend on sequencing to answer a specific question.

“Our goal is to be as accurate as we can,” he says. “For single nucleotide variants (SNVs), we see about a 93-95% verification rate. And we’re pretty happy with that. The question becomes how many samples you do, and not what you do to a sample. Depending what question you’re asking, the number of samples affects your power overall.”

John works in Ontario. We ask him about the state of clinical genomics in Canada, a country with a single payer system.


