Major Sequencing Projects Should Be Done with Long Reads, Says Dan Geraghty


Dan Geraghty, Researcher, Fred Hutchinson Cancer Center; CEO, Scisco Genetics

Listen (4:43) Unable so far to find causal linkages in MHC region of the genome

Listen (4:43) Illumina vs. PacBio

Listen (9:27) Major sequencing projects should be done with long reads

Listen (9:32) Is the message about long reads getting out there?

Listen (3:15) What projects will you pursue with long read technology?

Dan Geraghty has a message for anyone looking for genetic causes of disease.

A researcher at the Fred Hutchinson Cancer Center, Dan has been working to characterize the difficult region of the genome known as the MHC, or major histocompatibility complex. This region controls a major part of the immune system and is linked with many common diseases. Until now, Dan says, researchers have been unable to find causal linkages in the MHC region to common diseases such as diabetes, celiac disease, and rheumatoid arthritis because they haven't been able to look at long enough pieces of DNA.

Unlike Mendelian diseases, where a single mutation is linked directly to the disease, the regions in the MHC that are linked to disease often include long "flanking" sequences that play a part. Until now, to get a complete look at a long genetic region, researchers have used Illumina's short read technology and then faced a lot of data analysis and finishing work. That finishing takes hours and hours, and even then it doesn't give an accurate picture.

"The finishing is really prohibitive for a modest research effort," says Dan.

Enter new long read technology. Recently Dan worked with PacBio, where he was able to get 40 kb read lengths. Contrast that with the 300 bp reads of the Illumina technology. There's just no comparison, he says. And the error rate for the long reads: one in a million.

"This is really high quality data," Dan says of the PacBio reads. "This is the kind of zero error rate where you can compare your cases and controls and easily validate them and have high confidence that what you're seeing is accurate."

What does this mean for genomic research going forward? Take the 1000 Genomes Project of the NHGRI. Shouldn't researchers be using long reads to get the most accurate data possible?

And just what projects is Dan pursuing with what he calls the "breakthrough" technology?

"We're hot on the trail," he says. "We basically can see the entire picture. We're not looking under a lamppost for the keys. It's daylight, and we can see the whole neighborhood. So we're gonna find the keys."

Podcast brought to you by: Pacific Biosciences - providers of long read sequencing solutions based on their Single Molecule Real Time technology.