On Bioinformatics Data Sharing and Collaboration: Andrew Carroll, DNAnexus

What does it take to collaborate in genomics?

A platform, for one thing. Over the past few years bioinformaticians have speculated about a dominant “go-to” site that would serve the software needs of those in genomics. Would it be a private company, a Google of genomics? Would it be a non-profit consortium? Or would it be created at the government level?

Today we talk to Andrew Carroll, VP of Science at DNAnexus, a company that has come closer than any other to being that community platform. Over a year ago, DNAnexus won the bid to host PrecisionFDA, a community platform developed to explore how to evaluate and regulate NGS assays.

DNAnexus began as a cloud storage company for genomics back when Amazon was only starting to look skyward, then evolved into an iTunes-like platform for genomics software. One app on the platform might excel at variant calling, while others specialize in tertiary analysis.

“From the visibility I have, I estimate around a four- to five-fold increase year over year in the volume of sequencing,” says Andrew. “Bioinformatics has grown to the point that it doesn’t pay to be a jack of all trades. A few years ago a project at the thousand- or ten-thousand-exome scale was a big deal. These days projects are coming up on hundreds of thousands of exomes, even hundreds of thousands of whole genomes.”

DNAnexus’ platform isn’t just a collection of bioinformatics apps; it’s also a portal where different kinds of organizations can find each other and collaborate: a healthcare provider and a sequencing center, for example. In addition to PrecisionFDA, DNAnexus has been announcing partnerships one after another: Regeneron/Geisinger, Rady Children’s Institute for Genomic Medicine, the Stanford Genomics Center. The company hasn’t sat back and waited for customers; it has been cultivating a proactive vision for genomic medicine by helping organizations be as open and collaborative as possible with their data.

"The work that we do through our partners actually tells the best story," says Andrew.

By Changing a Basic Lab Step, Acoustic Liquid Transfer Is Having a Broad Impact

Freeman Dyson famously said, “the great advances in science usually result from new tools rather than from new doctrine.”

Today we talk with Mark Fischer-Colbrie, CEO of Labcyte, a company that has made some waves, literally, in the life sciences by changing a very fundamental laboratory procedure: liquid transfer. For some years now, Labcyte has been selling machines that move liquid around with sound. By eliminating the need for pipette tips and other “solid” surfaces, the machines deliver far greater precision.

“Science demands precision and in ever-increasing amounts,” says Mark at the outset of today’s interview.

Acting like a rifle that shoots liquid straight up, the acoustic technology has made inroads into most life science applications. Mark talks about the Institute for Molecular Medicine Finland (FIMM) using it to do truly personalized medicine: ex vivo screening of cancer patients’ cells against hundreds of available drugs. There is often precious little sample to work with, and the errors from traditional pipetting can mean the difference between life and death. The machines are also used widely in the pharma and synthetic biology communities for their ability to reduce costs.

“Imagine saving four months on a single drug discovery cycle,” says Mark.

Recently, AstraZeneca has integrated acoustic technology into mass spectrometry, showing its potential to immediately upgrade other tools that have been around for some time.

Should everyone change over to acoustic dispensing?

Many Biologists Today Don’t Have Enough Computer Science to Use the Databases

Moray Campbell was, by any measure, an accomplished and successful cancer biologist at the renowned Roswell Park Cancer Institute. Then one day he woke up and realized he was becoming irrelevant: a traditionally trained wet-lab biologist being left behind by computer science. Any scientist must keep up with their field, but this was different. A few conferences and journals, reading the news every day, was not going to be enough. Facing reality, Moray enrolled in a bioinformatics master’s program at Johns Hopkins.

That was in 2013.

"Biology is genomics. And genomics is basically computer science,” says Moray at the outset of today’s program. “In 2013 I would have said I look at the epigenetics of prostate cancer. Now I say that I look at the epigenomics of prostate cancer. I’ve become genomically literate."

What was it like for Moray to go back to school mid-career with teachers and homework and finals? Did he doubt his decision when the going got tough? Is it harder for biologists to learn coding or coders to learn biology?

Moray has now finished his degree, and along the way he learned that, as a discipline, we are still struggling with how to teach genomics to biologists.

He gives the example of datasets such as TCGA that many biologists today don’t even know how to use.

“These data are there. And they’re being used very deeply,” he says. “But I suspect by quite a restricted community. If you don’t even know how to download a file, how are you going to be able to analyze it?”
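To make his point concrete, here is a minimal sketch in Python of that very first step: listing a few open-access TCGA files through the GDC, the NCI portal that now hosts the data. The endpoint and field names follow the GDC’s public REST API as best we know it; the project and filter values are purely illustrative.

```python
import json
import requests

# Ask the public GDC REST API (the current host for TCGA data) for a few
# open-access files from the TCGA prostate cancer project (TCGA-PRAD).
FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"

filters = {
    "op": "and",
    "content": [
        {"op": "=", "content": {"field": "cases.project.project_id",
                                "value": "TCGA-PRAD"}},
        {"op": "=", "content": {"field": "access", "value": "open"}},
    ],
}

params = {
    "filters": json.dumps(filters),
    "fields": "file_id,file_name,data_category",
    "format": "JSON",
    "size": "5",
}

resp = requests.get(FILES_ENDPOINT, params=params)
resp.raise_for_status()

for hit in resp.json()["data"]["hits"]:
    print(hit["file_id"], hit["file_name"])

# Each file_id can then be fetched from
# https://api.gdc.cancer.gov/data/<file_id>
```

A dozen lines of boilerplate to a programmer; a wall, as Moray observes, to a biologist who has never been shown how.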

It's been a dramatic transition for Moray. Looking back now, he says, "Biology is dead; long live biology."

With Two New Easy-to-Use Sequencing Instruments, Thermo Readies for Primetime in the Clinic

The race to the $1,000 genome has been full of breathtaking advances, one after another. But is next-gen sequencing reaching maturity? Will there be that many more significant innovations?

Yes, says our first guest on today’s program, Andy Felton, VP of Product Management at Thermo’s Ion Torrent division. Andy presented Thermo’s two new sequencing instruments, the Ion S5 and the Ion S5 XL, at a press conference today. While their numbers (accuracy, read length, throughput) aren’t a dramatic leap over those of their predecessors, the Personal Genome Machine (PGM) and the Ion Proton, the S5 and S5 XL perhaps now lead the industry in ease of use.

Paired with the Ion Chef, the sample prep station Thermo launched last year, and with robust bioinformatics software, the workflow from sample to report is impressively simple and straightforward: only two pipetting steps are required. The genomics team at Thermo is betting that this simplicity will open a new market. “Genomics for all,” they boast.

Does this just catch Thermo up with Illumina, or does it put the company in the lead for clinical sequencing? We ask our second guest, Shawn Baker, CSO of AllSeq.

Bina CEO Details Secret to Success in NGS Informatics

Last year, pharma giant Roche went on a buying spree, picking up one company after another. In December, when it was announced that Roche had acquired Bina Technologies, many of us were playing catch-up. Who is Bina, and how does it fit into the overall bioinformatics space?

Today we hear from Bina’s CEO, Narges Bani Asadi. Like many young bioinformatics companies, Bina has changed its service and product since spinning out of Stanford and UC Berkeley four years ago. Narges says the biggest demand from customers is for a comprehensive solution for the entire organization. Often, she says, she encounters brilliant bioinformaticians at customer organizations who are completely overwhelmed by all of the various informatics tools available. Many of these tools are offered free over the internet, and, she says, this is creating “open source overload.”

Bina has been an ambitious company from the start, working to provide NGS users with a comprehensive informatics solution: beefy, fast infrastructure; an interface for the various kinds of users in an organization; and high-powered analytics. Narges is excited about the Roche buyout, saying it will speed up Bina’s plans. Indeed, just providing bioinformatics solutions to Roche’s drug and diagnostic divisions is already a huge project.

What was Bina doing so well that attracted Roche, and what will the future NGS informatics ecosystem look like? Join us for an inside look at the world of bioinformatics with one of the space’s most dynamic leaders.

Paperwork, Not Algorithms, the Biggest Challenge for Large Bioinformatics Projects, Says David Haussler, UCSC

Guest:

David Haussler, Director, Center for Biomolecular Science and Engineering, UCSC
Bio and Contact Info

Listen (8:08) Paperwork not algorithms the biggest challenge with bioinformatics

Listen (7:01) With Amazon Cloud around are compute and storage still issues?

Listen (3:23) Global Alliance for Genomics and Health

Listen (5:05) What are the technical challenges yet to be tackled?

Listen (7:35) A global bioinformatics utility has to be an NGO

David Haussler and his team at UC Santa Cruz have gone from one large bioinformatics project to another. After creating the original Genome Browser (which still gets over 1 million hits per month), David worked on building a large data set for cancer genomics through The Cancer Genome Atlas.

“With more data comes statistical power,” David says in today’s show. “The only way we can separate out the ‘driver’ mutations from the ‘passenger’ mutations is to have a large sample of different cancers.”

This makes sense: one needs very large cohorts to tell whether a mutation recurs more often than the background rate of random mutation would predict. A toy calculation, sketched below, shows how quickly sample size changes the answer.
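The following Python sketch is purely illustrative, not anything from Haussler’s actual pipelines: a one-sided binomial test asking whether a gene mutated in 1.5% of tumors stands out against an assumed 1% background passenger rate. Both rates are invented for the example.

```python
from scipy.stats import binomtest

# Invented numbers: passenger mutations hit a typical gene in ~1% of tumors;
# our candidate gene is mutated in 1.5% of the tumors we sequence.
background_rate = 0.01

# With 1,000 tumors (15 mutated), the excess is hard to separate from noise.
small_cohort = binomtest(k=15, n=1_000, p=background_rate, alternative="greater")

# With 2,000 tumors (30 mutated), the same excess becomes far more convincing.
large_cohort = binomtest(k=30, n=2_000, p=background_rate, alternative="greater")

print(f"1,000 tumors: p = {small_cohort.pvalue:.3f}")  # roughly 0.08
print(f"2,000 tumors: p = {large_cohort.pvalue:.3f}")  # roughly 0.02
```

The same mutation frequency that looks like noise in a small cohort becomes a credible driver signal in a larger one, which is exactly why scale matters. So what have been the challenges to building such a large data set?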

David says issues around consent and privacy have held up his projects more than any technical difficulty. For example, the NIH has been holding meetings for over a year now to determine whether its data can be put on the commercial cloud. Then there are the issues of connecting large medical institutions around the country, and across various countries of the world. David is a co-founder of the Global Alliance for Genomics and Health, which he says is nearing the tipping point of becoming THE bioinformatics utility that is globally adopted.

In the days of commercial offerings such as the Amazon cloud, are compute and storage still a problem? And once the privacy issues are seen to, what technical challenges remain for bioinformaticians like Haussler?

Podcast brought to you by: National Biomarker Development Alliance - Collaboratively creating standards for end-to-end systems-based biomarker development—to advance precision medicine

Raising the Standards of Biomarker Development - A New Series

We talk a lot on this show about the potential of personalized medicine. Never before have we learned at such breakneck speed just how our bodies function. The pace of biological research staggers the mind and hints at a time when we will “crack the code” of the system that is Homo sapiens, going from picking the low-hanging fruit to a more rational approach. The high-tech world has put just the tools to do it at biologists’ fingertips: plenty of compute and plenty of storage with which to untangle, to decipher, the human body. Yet still, we talk of potential.

Training the Next Generation of Bioinformaticians: Russ Altman, Stanford

Guest:

Russ Altman, Dept Chair, Bioengineering, Stanford University

Bio and Contact Info

Listen (5:32) A bioinformatician bottleneck?

Listen (4:19) Does the engineer or coder have enough basic biology?

Listen (5:04) Have we been overly reductionist?

Listen (5:16) Beautiful but useless algorithms

Listen (4:13) New breakthroughs in natural language processing

Listen (3:39) A new regulatory science

For our last episode in the series, The Bioinformatician Bottleneck, we turned to someone who has not only led plenty of bioinformatics projects (he's been lead investigator for the PharmGKB knowledgebase) but is also training the next generation of bioinformaticians. Russ Altman is Director of the Biomedical Informatics program at Stanford. He's also an entertaining speaker who's comfortable with an enormous range of topics.

It's been some time since we had Russ on the program, so we had some catching up to do. What are his thoughts on the recent philosophy of science topics we've been discussing? Are the new biologists becoming mere technicians? What is meant by open data? He warns against being too black and white when it comes to reductionism or antireductionism. And he agrees that the new biologist needs quite a bit of informatics training, but he's not worried that every bioinformatician has to be a better biologist, saying that there's a whole range of jobs out there.

What's Russ excited about in 2014? The increased ability to do natural language processing, he says.

"We have 25 million published abstracts that are freely available. So that's a lot of text. Increasingly we're having access to the full text and figures. I think we're near the point where we'll have an amazing capability to do very high fidelity interpretation of what's being said in these articles," he says in today's interview.

Russ finishes up by talking about a new West Coast FDA center in which he's involved. The center is focused on a program for a new emerging regulatory science, which he defines as the science needed to make good regulatory decisions.

"This area of regulatory science," he says, "has great opportunity to accelerate drug development and drug discovery."

I saw Russ at Stanford's Big Data conference after our interview and asked him at what age he decided against Hollywood and in favor of a life in academia and science.

"Who says I did?" he retorted without hesitation.

Podcast brought to you by: Roswell Park Cancer Institute, dedicated to understanding, preventing and curing cancer for over 115 years.

Myths of Big Data with Sabina Leonelli, Philosopher of Information

Guest:

Sabina Leonelli, Philosopher, University of Exeter

Bio and Contact Info

Listen (6:44) Not a fan of the term Big Data

Listen (4:20) Something lost in bringing data together from various scientific cultures

Listen (3:36) Are data scientists really scientists?

Listen (4:11) Controversies around Open Data

Listen (3:03) Data systems come with their own biases

Listen (6:22) Message to bioinformaticians: Come up with the story of your data

Listen (1:15) Data driven vs hypothesis driven science

Listen (2:46) Thoughts on the Quantified Self movement

For the next installment in our Philosophy of Science series, we look at issues around data. Sabina Leonelli is a philosopher of information who collaborates with bioinformaticians. In today's interview, she expresses her concerns about the terms Big Data and Open Data.

"I have to admit, I'm not a big fan of this expression, 'Big Data,'" she says at the outset of the show.

Using data in science is, of course, a very old practice. So what's new about "big" data? Sabina is mostly concerned about the challenges of bringing data together from various sources. The biggest challenge here, she says, is with classification.

"Biology is fragmented in a lot of different epistemic cultures . . and each research tradition has different preferred ways of doing things," she points out. "What I'm interested in is the relationship between the language used and the actual practices. And there appears to be a very strong relationship between the way that people perform their research and the way in which they think about it. So terminology becomes a very specific signal for the various research traditions."

Sabina goes on to point out that the nuances of specific research traditions can be lost as data are integrated across traditions. For instance, most large bioinformatics databases are curated in English, whereas some of the underlying research may originally have been recorded in another language.

This becomes especially important with the new movement toward Open Data, where biases are built into the databases.

"The problem resides with the expectation that what is 'Open Data' is all the data there is," she says.

In fact, the data in Open Data tend to come from databases that are highly standardized, and often from the most powerful labs.

How can bioinformaticians deal with these challenges? Sabina says researchers should be more diligent about creating "a story" around their data, which would help make the biases more transparent. She also says that a lot of conceptual effort must go into designing databases from the outset so that the data can serve as-yet-unknown questions in the future.
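What might such a story look like in practice? One lightweight possibility, purely illustrative and not anything Leonelli prescribes, is a machine-readable provenance record that travels alongside the dataset; every field name here is invented.

```python
import json

# A hypothetical provenance "story" attached to a dataset before it is
# deposited in an open database. All field names are invented for illustration.
data_story = {
    "dataset": "prostate_tumor_expression_v2",
    "collected_by": "Example Lab, University of Somewhere",
    "original_language": "Japanese",  # records later translated to English
    "classification_scheme": "lab-internal ontology, later mapped to GO",
    "known_biases": [
        "samples drawn from a single urban hospital",
        "only protein-coding genes assayed",
    ],
    "intended_use": "differential expression; other reuses are untested",
}

# Write the story next to the data so it travels wherever the data go.
with open("prostate_tumor_expression_v2.story.json", "w") as fh:
    json.dump(data_story, fh, indent=2)
```

Even a sketch this small surfaces the kinds of bias Sabina warns about: who collected the data, in what language, and under which classification scheme.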

We finish the interview with her thoughts on the Quantified Self movement.

Podcast brought to you by: Chempetitive Group - "We love science. We love marketing. We love the idea of combining the two to make great things happen for your marketing communications."


