How PacBio is making it easier for customers to analyze sequencing data
Background
Pacific Biosciences (PacBio) is an award-winning biotechnology company that manufactures instruments for genome sequencing. PacBio offers three complementary sequencing platforms: Sequel and Revio, which deliver accurate long-read sequencing, and Onso, which offers high-throughput short-read sequencing. PacBio has a large and growing international customer base that uses these technologies to advance research in diverse fields, including human health and agriculture.
Need
Sequencing each genome generates large volumes of raw data, which must be processed through computationally intensive bioinformatics pipelines to produce actionable results, such as genome assemblies and variant calls, that can inform research and clinical care. PacBio recognized a need to offer its customers a simple way to run best-practices bioinformatics pipelines on their data. It designed its pipelines to be: (1) portable, so that they can be run on commercial cloud platforms as well as on premises, (2) sovereign, so that they can be run in customer environments in any geographic region, (3) human-readable, so that they are easy to maintain and extend, and (4) engine-agnostic, so that customers can run them using their preferred workflow engine. PacBio sought a partner to ensure it delivered on this need.
Solution
PacBio partnered with Bioinformatics Services from Omics AI to test the deployment of the PacBio best-practices Workflow Description Language (WDL) pipelines. By deploying onto Workbench, PacBio could be confident that customers would have easy access to these new pipelines. PacBio also requested that a comprehensive framework be developed for testing, publishing, and running pipelines on Workbench, with an emphasis on quality and ease of use.
Results
PacBio partnered with DNAstack to help improve customers’ access to, and capabilities for, processing raw data generated by its sequencing instruments. This made the PacBio pipelines more broadly accessible. The pipelines are written in WDL, an open-source, standards-compliant format, and were designed to be portable, sovereign, human-readable, and engine-agnostic. The PacBio WGS Variant pipeline has been tested on premises and on commercial cloud platforms, including Microsoft Azure, Amazon Web Services, and Google Cloud Platform. All pipelines listed on the Workflows Store can be easily run in Workbench. Through this collaboration, PacBio has made it easier and faster to run best-practices bioinformatics processing, accelerating the flywheel of discovery for customers using its instruments.
Compatible environments: 4
Run time: <24 hr
Compute cost per sample: $30