

On Bioinformatics Data Sharing and Collaboration: Andrew Carroll, DNAnexus

What does it take to collaborate in genomics?

A platform, for one thing. Over the past few years bioinformaticians have been speculating about a dominant "go-to" site that would serve the software needs of those in genomics. Would it be a private company, a Google of genomics? Or would it be a non-profit consortium? Would it be created at the government level?

Today we talk to Andrew Carroll, the VP of Science at DNAnexus, a company that has come closer than any other to being that community platform. Over a year ago, DNAnexus was selected to host PrecisionFDA, a community platform for exploring how to evaluate and regulate NGS assays.

DNAnexus started out as a cloud storage company for genomics back when Amazon was just beginning to look to the sky, then evolved into an iTunes-like platform for genomics software apps. One app on the platform might be great at variant calling, while others specialize in tertiary analysis.

“From the visibility I have, I estimate around a four- to five-fold increase year over year in the volume of sequencing," says Andrew. "Bioinformatics has grown to the point that it doesn’t pay to be a jack of all trades. A few years ago a project at the thousand or ten thousand exome scale was a big deal. These days projects are coming up on hundreds of thousands of exomes, even hundreds of thousands of whole genomes."

DNAnexus’ platform isn’t just a collection of bioinformatics apps; it’s also a portal where different kinds of organizations can find each other and collaborate; for example, a healthcare provider and a sequencing center. In addition to PrecisionFDA, DNAnexus has been announcing partnerships one after another: Regeneron/Geisinger, Rady Children’s Institute for Genomic Medicine, the Stanford Genomics Center. The company hasn’t sat back and waited for customers, but has been cultivating a proactive vision for genomic medicine by helping organizations be as open and collaborative as possible with their data.

"The work that we do through our partners actually tells the best story," says Andrew.

By Changing a Basic Lab Step, Acoustic Liquid Transfer Is Having a Broad Impact

Freeman Dyson famously said, “the great advances in science usually result from new tools rather than from new doctrine.”

Today we talk with Mark Fischer-Colbrie, CEO of Labcyte, a company which has made some waves--literally--in the life sciences by changing a very fundamental laboratory procedure: liquid transfer. For some years now, Labcyte has been selling machines that move liquid around with sound. By eliminating the need for pipette tips and other "solid" surfaces, the machines deliver much greater precision.

“Science demands precision and in ever-increasing amounts,” says Mark at the outset of today’s interview.

Acting like a rifle shooting liquid straight up, the new acoustic technology has made inroads into most life science applications. Mark talks about the Institute for Molecular Medicine Finland (FIMM) using the technology to do truly personalized medicine, screening cancer patient cells ex vivo against hundreds of available drugs. There is often precious little sample to work with, and the errors from traditional pipetting might mean the difference between life and death. The machine is also used widely by the pharma and synthetic biology communities for its ability to reduce costs.

“Imagine saving four months on a single drug discovery cycle,” says Mark.

Recently, AstraZeneca has integrated acoustic technology into mass spectrometry, showing the potential to immediately upgrade other tools that have been around for some time.

Should everyone change over to acoustic dispensing?

Many Biologists Today Don’t Have Enough Computer Science to Use the Databases

Moray Campbell was, for all intents and purposes, an accomplished and successful cancer biologist at the renowned Roswell Park Cancer Center. Then one day he woke up and realized he was becoming irrelevant. He was a traditionally trained wet lab biologist who was getting left behind by computer science. Any scientist must keep up with their field, but this was different. A few conferences and journals--reading the news every day--was not going to be enough. Facing reality, Moray enrolled in a bioinformatics master's program at Johns Hopkins.

That was in 2013.

"Biology is genomics. And genomics is basically computer science,” says Moray at the outset of today’s program. “In 2013 I would have said I look at the epigenetics of prostate cancer. Now I say that I look at the epigenomics of prostate cancer. I’ve become genomically literate."

What was it like for Moray to go back to school mid-career with teachers and homework and finals? Did he doubt his decision when the going got tough? Is it harder for biologists to learn coding or coders to learn biology?

Moray has now finished his degree, and in the process he learned that, as a discipline, we're still struggling with how to teach genomics to biologists.

He gives the example of datasets such as TCGA that many biologists today don’t even know how to use.

“These data are there. And they’re being used very deeply,” he says. "But I suspect by quite a restricted community. If you don’t even know how to download a file, how are you going to be able to analyze it?"
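To make the hurdle concrete, here is a minimal sketch (ours, not Moray's) of how a biologist might list and download an open-access TCGA file through the NCI Genomic Data Commons (GDC) REST API, which now distributes TCGA data. The project, data category, and fields below are illustrative assumptions, not anything discussed in the interview.

```python
# A minimal sketch of pulling an open-access TCGA file through the NCI GDC
# REST API. The project ID, data category, and fields are illustrative
# choices, not anything from the interview.
import json
import requests

GDC_API = "https://api.gdc.cancer.gov"

# Ask for a few open-access expression files from the TCGA prostate cancer project.
filters = {
    "op": "and",
    "content": [
        {"op": "in", "content": {"field": "cases.project.project_id", "value": ["TCGA-PRAD"]}},
        {"op": "in", "content": {"field": "access", "value": ["open"]}},
        {"op": "in", "content": {"field": "data_category", "value": ["Transcriptome Profiling"]}},
    ],
}

params = {
    "filters": json.dumps(filters),
    "fields": "file_id,file_name,file_size",
    "size": "5",
    "format": "JSON",
}

resp = requests.get(f"{GDC_API}/files", params=params)
resp.raise_for_status()
hits = resp.json()["data"]["hits"]

for h in hits:
    print(h["file_id"], h["file_name"], h["file_size"])

# Download the first hit to the current directory.
if hits:
    data = requests.get(f"{GDC_API}/data/{hits[0]['file_id']}")
    data.raise_for_status()
    with open(hits[0]["file_name"], "wb") as fh:
        fh.write(data.content)
```

Even a toy script like this assumes comfort with JSON, REST calls, and scripting in general, which is exactly the kind of literacy gap Moray is describing.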

It's been a dramatic transition for Moray. Looking back now he says, "biology is dead; long live biology."

Frontiers of Sequencing: Putting Long Reads and Graph Assemblies to Work

OK, so we get it. Long read sequencing technology is cool. But how cool? Is it another great player on the field, or does it change the game altogether? 

The Mike Schatz lab at Cold Spring Harbor is well known for de novo genome assemblies and for its work on structural variation in cancer genomes, so we were curious to hear how long reads have impacted their work. In today's show, lab leader Mike Schatz and doctoral student Maria Nattestad tell of two new projects: the de novo assembly of a very difficult but important flatworm genome, and making better variant calls for oncogenes such as HER2.

In the case of the flatworm, Mike says that the move to PacBio's long reads improved the assembly by more than 100-fold. That's the difference between looking at a super-high-resolution picture and a fuzzy, blurry one, he says. With her work on cancer cell lines, Maria is seeing variants that just weren't there with short reads. Will her work translate to lower false positive rates for HER2 in clinical diagnostics?

What will be the major headline for sequencing and informatics in 2016?

Mike says we’ll see many more reference genomes completed, and that the term “reference genome” itself is changing as we go from one standard reference genome to multiple reference genomes representing the broader population. These new reference genomes are pushing bioinformaticians to come up with new ways to visualize and compare genomes. Maria details her work on using “graph” assemblies as opposed to the linear approach made popular by the Genome Browser. She says that a new generation of informaticians is already rethinking genome visualization using graph assemblies. (Included below is an image representing her work.)
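For readers unfamiliar with the idea, here is a toy sketch (ours, not Maria's code) of what a graph representation adds over a linear reference: sequence segments become nodes, and any observed adjacency, such as an alternate allele branching off and rejoining the reference, becomes an edge, loosely in the spirit of the GFA format used by several graph tools.

```python
# A toy illustration of a sequence graph: the same locus can carry alternate
# segments, and a rearranged junction is simply an extra edge. Segment names
# and sequences here are made up for illustration.
from collections import defaultdict

segments = {
    "s1": "ACGTACGT",   # shared upstream flank
    "s2a": "TTG",       # reference allele at a variant site
    "s2b": "TTGGTG",    # alternate allele (e.g., a small insertion)
    "s3": "CCATGCA",    # shared downstream flank
}

# Edges record which segment can follow which; both alleles branch from s1
# and rejoin at s3, something a single linear string cannot express.
links = defaultdict(list)
links["s1"] += ["s2a", "s2b"]
links["s2a"] += ["s3"]
links["s2b"] += ["s3"]

def walk(node, path, out):
    """Spell out every sequence along a path from `node` to a sink."""
    path = path + [node]
    if not links[node]:
        out.append("".join(segments[n] for n in path))
        return
    for nxt in links[node]:
        walk(nxt, path, out)

haplotypes = []
walk("s1", [], haplotypes)
print(haplotypes)  # two sequences, one per allele, sharing the flanks
```

A linear reference can spell out only one of those haplotypes; the graph keeps both, and a rearranged junction from a cancer genome would simply be one more edge.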

Neither mentioned it, so we ask at the end, what about Oxford Nanopore’s tech?

 

(The spectral karyotype of the Her2-amplified breast cancer cell line SK-BR-3. The original chromosomes are different colors, so this genome is a complex mixture of various chromosomes. The total number of chromosomes has also jumped from 46 to 80, and there is approximately twice as much DNA as in a normal human genome. Maria Nattestad and Mike Schatz are studying this genome to see how oncogenes like Her2 became amplified while all these changes took place in the big picture of the genome.)

With Two New Easy-to-Use Sequencing Instruments, Thermo Readies for Primetime in the Clinic

The race to the $1,000 genome has been full of breathtaking advances, one after the other. But is next gen sequencing reaching maturity? Will there be that many more significant innovations?

Yes, says our first guest in today’s program, Andy Felton, VP of Product Management at Thermo’s Ion Torrent division. Andy presented Thermo’s two new sequencing instruments, the Ion S5 and the Ion S5 XL, at a press conference today. While their numbers (accuracy, read length, throughput) aren’t a dramatic advance over those of their predecessors--the Personal Genome Machine (PGM) and the Ion Proton--the S5 and S5 XL perhaps lead the industry now in ease of use.

Integrated with the Ion Chef, Thermo’s sample prep station launched last year, and with robust bioinformatics software, the workflow from sample to report is impressively simple and straightforward. Only two pipetting steps are required. The genomics team at Thermo is betting that this attractive simplicity will open a new market. "Genomics for all," they boast.

Does this just catch Thermo up with Illumina, or does it put them in the lead for clinical sequencing? We ask our second guest, Shawn Baker, CSO of AllSeq. (See Shawn's own blog here.)

Bina CEO Details Secret to Success in NGS Informatics

Last year, pharma giant Roche went on a buying spree, picking up one company after another. In December, when it was announced that they had bought Bina Technologies, many of us were playing catch-up. Who is Bina, and how do they fit into the overall bioinformatics space?

Today we hear from Bina's CEO, Narges Bani Asadi. As with many new bioinformatics companies, Bina has changed their service and product since they spun out of Stanford and UC Berkeley four years ago. Narges says that the biggest demand from customers is to provide a comprehensive solution for the entire organization. Often, she says, she encounters brilliant bioinformaticians working at customer organizations who are completely overwhelmed by all of the various informatics tools available. Many of these tools are offered free over the internet, and, she says, it’s creating “open source overload.”

Bina has been a very ambitious company from the start, working to provide NGS users with a comprehensive informatics solution, from beefy, fast infrastructure, to an interface for the various kinds of users in an organization, to high-powered analytics. And Narges is excited about the Roche buyout, saying that it will speed up their plans. Indeed, just providing bioinformatics solutions to Roche in both their drug and diagnostic divisions is already a huge project.

What was Bina doing so well that attracted Roche, and what will the future NGS informatics ecosystem look like? Join us for an inside look at the world of bioinformatics with one of the space’s most dynamic leaders.

Paperwork, Not Algorithms the Biggest Challenge for Large Bioinformatics Projects, Says David Haussler, UCSC

Guest:

David Haussler, Director, Center for Biomolecular Science and Engineering, UCSC
Bio and Contact Info

Listen (8:08) Paperwork not algorithms the biggest challenge with bioinformatics

Listen (7:01) With Amazon Cloud around are compute and storage still issues?

Listen (3:23) Global Alliance for Genomics and Health

Listen (5:05) What are the technical challenges yet to be tackled?

Listen (7:35) A global bioinformatics utility has to be an NGO

David Haussler and his team at UC Santa Cruz have gone from one large bioinformatics project to another. After creating the original Genome Browser (which still gets over 1 million hits per month), David worked to build a large data set for cancer genomics, The Cancer Genome Atlas.

“With more data comes statistical power,” David says in today’s show. “The only way we can separate out the “driver” mutations from the “passenger” mutations is to have a large sample of different cancers."

This makes sense. One needs millions of samples to tell when a mutation is just random noise and when it recurs more often than chance alone would predict. So what have been the challenges to building such a large data set?
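As a back-of-the-envelope illustration (ours, not David's numbers), suppose a given gene picks up a passenger mutation by chance in about 1% of tumors, while a true driver would be mutated in about 3%. A simple binomial calculation shows how the excess only becomes convincing once the cohort is large:

```python
# A back-of-the-envelope sketch of why cohort size matters when separating
# driver from passenger mutations. The 1% background and 3% driver rates are
# assumptions for illustration only.
from scipy.stats import binom

background_rate = 0.01   # chance a passenger mutation hits this gene
driver_rate = 0.03       # assumed true rate if the gene is a driver

for n_tumors in (100, 1_000, 10_000):
    observed = round(driver_rate * n_tumors)          # expected count if driver
    # P(seeing >= observed mutations if the gene were only a passenger)
    p_value = binom.sf(observed - 1, n_tumors, background_rate)
    print(f"n={n_tumors:>6}  observed={observed:>4}  p={p_value:.2e}")
```

With a hundred tumors the excess could easily be noise, especially after correcting for the roughly 20,000 genes being tested; with tens of thousands of tumors it becomes unmistakable.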

David says issues around consent and privacy have actually held up his projects more than any technical difficulties. For example, the NIH has been meeting for over a year now to determine whether its data can be put on the commercial cloud. In addition, there are challenges in connecting large medical institutions across the country and in other countries around the world. David is a co-founder of the Global Alliance for Genomics and Health, which he says is nearing the tipping point of being THE bioinformatics utility that will be globally adopted.

In the days of commercial offerings such as the Amazon Cloud, are compute and storage still a problem? And what, after the privacy issues are seen to, are the technical challenges for bioinformaticians like Haussler?

Podcast brought to you by: National Biomarker Development Alliance - Collaboratively creating standards for end-to-end systems-based biomarker development—to advance precision medicine

Raising the Standards of Biomarker Development - A New Series

We talk a lot on this show about the potential of personalized medicine. Never before have we learned at such breakneck speed just how our bodies function. The pace of biological research staggers the mind and hints at a time when we will “crack the code” of the system that is Homo sapiens, going from picking the low-hanging fruit to a more rational approach. The high tech world has put the tools to do it at biologists’ fingertips. There is plenty of compute and plenty of storage available to untangle and decipher the human body. Yet still, we talk of potential.

Training the Next Generation of Bioinformaticians: Russ Altman, Stanford

Guest:

Russ Altman, Dept Chair, Bioengineering, Stanford University

Bio and Contact Info

Listen (5:32) A bioinformatician bottleneck?

Listen (4:19) Does the engineer or coder have enough basic biology?

Listen (5:04) Have we been overly reductionist?

Listen (5:16) Beautiful but useless algorithms

Listen (4:13) New breakthroughs in natural language processing

Listen (3:39) A new regulatory science

For our last episode in the series, The Bioinformatician Bottleneck, we turned to someone who has not only done lots of bioinformatics projects (he's been lead investigator for the PharmGKB Knowledgebase) but also one who is training the next generation of bioinformaticians. Russ Altman is Director of the Biomedical Informatics program at Stanford. He's also an entertaining speaker who's comfortable with an enormous range of topics.

It's been some time since we had Russ on the program, so we had some catching up to do. What are his thoughts on the recent philosophy of science topics we've been discussing? Are the new biologists becoming mere technicians? What is meant by open data? He warns against being too black and white when it comes to reductionism or antireductionism, and he agrees that the new biologist needs quite a bit of informatics training. But he's not worried that all bioinformaticians have to be better biologists, saying that there's a whole range of jobs out there.

What's Russ excited about in 2014? The increased ability to do natural language processing, he says.

"We have 25 million published abstracts that are freely available. So that's a lot of text. Increasingly we're having access to the full text and figures. I think we're near the point where we'll have an amazing capability to do very high fidelity interpretation of what's being said in these articles," he says in today's interview.

Russ finishes up by talking about a new West Coast FDA center in which he's involved. The center is focused on a program for a new emerging regulatory science, which he defines as the science needed to make good regulatory decisions.

"This area of regulatory science," he says, "has great opportunity to accelerate drug development and drug discovery."

I saw Russ at Stanford's Big Data conference after our interview and asked him at what age he decided against Hollywood and for a life of academia and science.

"Who says I did?" he retorted without hesitation.

Podcast brought to you by: Roswell Park Cancer Institute, dedicated to understanding, preventing and curing cancer for over 115 years.

Stanford’s Big Data in BioMedicine Conference Turns Two

With Silicon Valley blazing on as the number one hot spot for high tech and the Bay Area claiming the same for biotech, it makes sense that Stanford, sitting there mid-peninsula basking in all that brilliance, should command a leading role in bioinformatics.


