News

Advancing a new era of breakthrough biomedical discoveries

For media requests, please contact press@dnastack.com.

News
August 11, 2020

COVID-19 Researchers Get a Boost From AI-Powered Genomics Cloud

DNAstack is helping scientists around the globe better understand COVID-19, so they can develop treatments and vaccines.


News
August 6, 2020

Harmonized Variant Calling for SARS-CoV-2 Genomes

To combat the current COVID-19 pandemic, scientists around the world are sequencing viral genomes at an accelerated pace.

These sequences are then deposited into a number of international databases, including the National Center for Biotechnology Information (NCBI; Figure 1). This approach has limitations: with data spread across multiple databases, it is challenging for a single researcher to consolidate data from different sources. The data were also generated and processed by different research groups at different institutions, which results in batch effects when the data are amalgamated, that is, differences in signal across groups of viral genomes processed together that represent technical noise rather than biological variation. The distributed nature of the data and the lack of uniformity in data generation and processing hinder the pace of scientific discovery. To accelerate discovery, we need to leverage the breadth of data available internationally, which requires consolidating data from multiple sources and “cleaning” it to reduce technical artifacts introduced during data processing.


To address this critical gap and accelerate scientific discovery, DNAstack released COVID Cloud, a cloud-based solution that uniquely indexes and integrates data from multiple international sources into a unified data lake. Identifying mutations in the viral genome can help researchers design novel therapeutics and track viral transmissions. COVID Cloud provides easily accessible viral genome data ready for analysis. The data in COVID Cloud can be browsed through apps providing different perspectives including faceted search, point lookup, and 3D visualizations. Users can also export the data into downstream analytical workspaces, such as Jupyter Notebooks, Power BI, or DNAstack’s Workflow Execution Service.

 

Figure 1

SARS-CoV-2 variant detection

To facilitate high-quality, reproducible variant calling at scale, we have developed and published an open source workflow written in Workflow Description Language (WDL) (available on Dockstore and GitHub). The workflow has two sub-workflows: one to handle long-read (i.e. Nanopore) and one to handle short-read, paired-end (i.e. Illumina) sequencing data (see the Nanopore variant calling and Illumina paired-end variant calling sections for more details). These workflows are designed to take NCBI run accessions as input (used to access raw FASTQ files) and return high-confidence variant calls (e.g. VCF files) as well as a consensus genome sequence.

We deployed our WDL variant calling workflows to identify mutations in 10,838 amplicon-based viral sequences hosted on NCBI (10,664 unique samples). Of these, 4,427 are Illumina paired-end short-read sequences and 4,471 are Nanopore long-read sequences. The resulting variant calls, as well as links to per-sample VCFs and assemblies, have been made freely available for exploration and download at COVID Cloud.

To further promote scalable and reproducible science, we have published both the Illumina and Nanopore WDL workflows. Below is a brief tutorial on how to call variants locally using the DNAstack variant calling workflow.

Running the workflow

The COVID-19 variant calling workflow is available on DNAstack’s GitHub. The workflow may also be viewed on Dockstore. The tools and pipelines used in the workflow have been packaged into publicly available Docker images, allowing the workflow to be run reproducibly across compute environments. Instructions for running the pipeline locally, along with test input files, can be found in the workflow documentation. Briefly, to run the workflow locally, edit the input_template.json file found in the inputs directory of the GitHub repository to specify the parameters for the sample of interest, then run it with a workflow runner of your choice, for example:

Using Cromwell:

java -jar cromwell.jar run main.wdl -i input_template.json

Using miniwdl:

miniwdl run main.wdl -i input_template.json

The required parameters are the run accession from NCBI (explore SARS-CoV-2 run data); the library type (NANOPORE or ILLUMINA_PE) to determine which pipeline to run; and a file and version number indicating the primer scheme that was used to prepare the library.
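As a rough sketch, the inputs file could be generated programmatically before calling a runner. The JSON key names below (`main.accession`, `main.library_type`, and so on), the accession, and the primer scheme values are hypothetical placeholders, not the workflow's real input names; the actual keys are the ones listed in input_template.json. Only the two runner command lines are taken from the text above.

```python
import json

# Library types named in the text above.
VALID_LIBRARY_TYPES = {"NANOPORE", "ILLUMINA_PE"}


def build_inputs(accession, library_type, primer_file, primer_version):
    """Return an inputs dict. All key names are hypothetical placeholders."""
    if library_type not in VALID_LIBRARY_TYPES:
        raise ValueError(
            f"library type must be one of {sorted(VALID_LIBRARY_TYPES)}"
        )
    return {
        "main.accession": accession,
        "main.library_type": library_type,
        "main.primer_scheme_file": primer_file,
        "main.primer_scheme_version": primer_version,
    }


def inputs_json(inputs):
    """Serialize the inputs dict to the JSON text a runner would read."""
    return json.dumps(inputs, indent=2)


def runner_command(runner, inputs_path="input_template.json"):
    """Return the command line for a runner, as shown in the text."""
    if runner == "cromwell":
        return ["java", "-jar", "cromwell.jar", "run", "main.wdl", "-i", inputs_path]
    if runner == "miniwdl":
        return ["miniwdl", "run", "main.wdl", "-i", inputs_path]
    raise ValueError(f"unknown runner: {runner}")


if __name__ == "__main__":
    # Placeholder accession and primer scheme values, for illustration only.
    example = build_inputs("SRR0000000", "NANOPORE", "primers.bed", "V3")
    print(inputs_json(example))
    print(" ".join(runner_command("miniwdl")))
```

Validating the library type up front mirrors the text: that one parameter decides which of the two pipelines runs, so a typo there is worth catching before any data is downloaded.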

Nanopore variant calling

Our Nanopore variant calling workflow leverages the ARTIC bioinformatics protocol (Loman et al., 2020), as implemented by the Connor lab. Briefly, reads are filtered and mapped to the SARS-CoV-2 reference genome (MN908947.3) using minimap2, following which amplicon primer sequences are trimmed. Next, medaka uses neural networks to create a consensus sequence (Oxford Nanopore Technologies, 2018). Medaka is also used to call variants, which are then fed into longshot to produce a set of high-confidence variants. Bcftools is used to generate the final consensus assembly sequence.
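To make the last step concrete, here is a toy illustration of what generating a consensus from variant calls means: substituting each called alternate base into the reference. It is a conceptual sketch only and handles single-nucleotide variants on a short made-up sequence; it is not how bcftools is implemented, and real consensus generation also has to deal with indels and low-coverage regions.

```python
def apply_snvs(reference, snvs):
    """Apply single-nucleotide variants to a reference sequence.

    snvs is an iterable of (pos, ref, alt) tuples with 1-based positions,
    following the VCF convention.
    """
    seq = list(reference)
    for pos, ref, alt in snvs:
        if len(ref) != 1 or len(alt) != 1:
            raise ValueError("this toy example handles SNVs only")
        if seq[pos - 1] != ref:
            raise ValueError(f"reference base mismatch at position {pos}")
        seq[pos - 1] = alt
    return "".join(seq)


if __name__ == "__main__":
    # Made-up 8-base reference and two made-up variant calls.
    toy_reference = "ACGTACGT"
    toy_calls = [(2, "C", "T"), (8, "T", "A")]
    print(apply_snvs(toy_reference, toy_calls))  # prints ATGTACGA
```

Checking the reference base before substituting is a cheap guard against applying calls made against a different reference or coordinate system.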

For more details, please see the ARTIC bioinformatics protocol documentation and the GitHub repository for the Nextflow implementation of this protocol that we use in our workflow.

Illumina paired-end variant calling

Our Illumina paired-end variant calling workflow leverages the SARS-CoV-2 Illumina GeNome Assembly Line (SIGNAL) protocol, produced by the McArthur lab. Briefly, reads are mapped to the human genome (GRCh38) using BWA-MEM to remove host reads (Li and Durbin, 2009). Next, adapters are trimmed and reads are mapped to the reference genome using BWA-MEM. Primer sequences are removed and variants are called using ivar with default parameters (Grubaugh et al., 2018). Ivar is also used to produce a consensus assembly sequence.
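The idea behind calling variants from mapped short reads can be sketched as a frequency threshold over per-position base counts. The sketch below is a simplified stand-in, not ivar's algorithm: the `min_freq` and `min_depth` values are illustrative assumptions rather than ivar's defaults, and real callers also weigh base quality and other evidence.

```python
def call_variants(reference, pileup, min_freq=0.03, min_depth=10):
    """Report non-reference bases seen at or above a frequency threshold.

    pileup maps a 1-based position to a dict of base counts at that position.
    Returns a list of (pos, ref, alt, frequency) tuples.
    """
    calls = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to say anything about this site
        ref_base = reference[pos - 1]
        for base, n in sorted(counts.items()):
            if base != ref_base and n / depth >= min_freq:
                calls.append((pos, ref_base, base, round(n / depth, 3)))
    return calls


if __name__ == "__main__":
    # Made-up reference and base counts, for illustration only.
    toy_reference = "ACGT"
    toy_pileup = {
        2: {"C": 5, "T": 95},   # strong evidence for C>T
        3: {"G": 99, "A": 1},   # 1% A is below the threshold
        4: {"T": 3, "C": 3},    # depth 6 is below min_depth
    }
    print(call_variants(toy_reference, toy_pileup))  # prints [(2, 'C', 'T', 0.95)]
```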

For more details please see the SIGNAL GitHub and documentation.

We are immensely grateful to the labs and individuals responsible for creating and open-sourcing the pipelines we have chosen to run this COVID analysis.

Our goals were twofold: we wanted to produce robust, reliable data that could be freely distributed to researchers in the hope that such a large volume of data could help provide novel insight into the virus, and we wanted to provide an easy-to-use pipeline for others aiming to process their own data or to reproduce our analysis on the NCBI data.

We will continue to iterate and improve upon our analysis pipelines, ingesting more data every day as it becomes available. In the near future, we plan to add an Illumina single-end pipeline, as well as a method to process metagenomic samples. We also plan to identify and ingest from more databases that expose raw sequencing data and metadata; while assembled genomes are valuable, per-site confidence can only be established when raw reads are available. Since our workflows are written in WDL and their environments are containerized, they can be reproducibly run in virtually any compute environment, ensuring accurate results in an infrastructure-independent way.

We believe that making science open, from raw data to analysis methods to shared results, is essential not only to combat fast-moving diseases such as COVID, but also more generally to increase the momentum of research across domains. It is our goal to continue to package and share best-practices workflows, analytics, data and results, so that researchers around the world can more rapidly leverage these resources that may otherwise not be available to them.



News
July 30, 2020

DNAstack Launches Genomic Data Explorer to Accelerate Research in COVID-19

DNAstack today announced COVID Cloud, an online destination for exploring one of the largest collections of viral genome sequences in the world. 

As the global caseload of COVID-19 exceeds 16.5 million people in over 200 countries, scientists are racing to study the genetics of the virus that causes it, SARS-CoV-2, to inform the development of urgently needed diagnostics and treatments. COVID Cloud is a software solution created by DNAstack that connects and shares a large and growing number of viral genomes seen around the world combined with visualization and analytical tools for scientists to examine the molecular machinery of the virus as it continues to spread and evolve.


“By sharing genetic data globally, we can mount a sort of digital immune response to help us defend against this and future outbreaks,” said Marc Fiume, CEO of DNAstack. “With COVID Cloud, we can help scientists take the best technologies in genomics, data sharing, cloud computing, and machine learning to the fight against COVID-19.”

COVID Cloud provides unified access to a globally representative repository of viral genomes, which is updated daily with new sequences from international biobanks. To reduce errors that arise when comparing datasets from multiple sources, DNAstack processes raw data using harmonized bioinformatics pipelines. These pipelines have been authored in the platform-agnostic Workflow Description Language and published as open source on Dockstore and GitHub, to promote reproducible science and community collaboration.

Figure: Using Variants, researchers can search the entire catalog of mutations found in SARS-CoV-2 sequences.

Datasets are shared over an integrated set of APIs defined by the Global Alliance for Genomics & Health, providing a standards-compliant platform on which the community can build powerful integrations and applications. For example, all of the files in COVID Cloud are served over the GA4GH Data Repository Service, a vendor-neutral way of representing files, to streamline their use in downstream analytical environments such as Jupyter Notebooks, Microsoft Power BI, and DNAstack’s Workflow Execution Service.
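As an illustration of why a vendor-neutral file representation helps downstream tools, the sketch below resolves a hostname-based DRS URI to the HTTPS endpoint that the GA4GH DRS v1 specification defines for fetching an object's metadata. The host `drs.example.org` and the object ID are hypothetical; the article does not state COVID Cloud's actual DRS endpoints.

```python
from urllib.parse import quote, urlparse


def drs_uri_to_url(drs_uri):
    """Map drs://<host>/<object_id> to the DRS v1 object metadata URL."""
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError("expected a URI of the form drs://<host>/<object_id>")
    object_id = parsed.path.lstrip("/")
    if not object_id:
        raise ValueError("missing object id")
    # The object ID is percent-encoded so it is safe as a single path segment.
    return (
        f"https://{parsed.netloc}/ga4gh/drs/v1/objects/"
        f"{quote(object_id, safe='')}"
    )


if __name__ == "__main__":
    # Hypothetical host and object ID.
    print(drs_uri_to_url("drs://drs.example.org/abc123"))
```

A notebook or workflow engine that understands this mapping can accept the same file reference regardless of which cloud or storage system sits behind the server.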

COVID Cloud also gives scientists intuitive controls for interactive exploration of the data. Using the Sequences tool, users can search over the entire catalogue of genomics data and information about the original source, collection date, and geographic location. Beacon lets scientists look up the prevalence of specific genetic mutations, such as D614G, a variant that appears to make SARS-CoV-2 more transmissible. With Molecules, researchers can manipulate three-dimensional representations of proteins encoded by the viral genome, like the Spike protein, in order to understand their physical conformations and predict how genetic mutations and therapeutic interventions may impact their function.
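For readers wondering how a protein-level name such as D614G relates to a genome coordinate that a variant lookup works with, here is a small sketch of the arithmetic. It assumes that the Spike coding sequence starts at position 21563 of the MN908947.3 reference and that codon 614 is GAT in the reference; neither detail appears in this article, so both should be checked against the reference annotation.

```python
# Assumption: 1-based start of the Spike (S) coding sequence in MN908947.3.
SPIKE_CDS_START = 21563

# Only the two codons needed for this example.
CODON_TO_AMINO_ACID = {"GAT": "D", "GGT": "G"}


def codon_coordinates(aa_position, cds_start=SPIKE_CDS_START):
    """Return the 1-based (first, last) genome positions of a codon."""
    first = cds_start + 3 * (aa_position - 1)
    return first, first + 2


def substitute(codon, offset, new_base):
    """Return the codon with the base at a 0-based offset replaced."""
    return codon[:offset] + new_base + codon[offset + 1:]


if __name__ == "__main__":
    first, last = codon_coordinates(614)
    print(first, last)  # prints 23402 23404
    mutated = substitute("GAT", 1, "G")  # change the middle base, position 23403
    print(CODON_TO_AMINO_ACID["GAT"], "->", CODON_TO_AMINO_ACID[mutated])  # prints D -> G
```

Under these assumptions, an A-to-G change at the middle base of the codon (genome position 23403) turns aspartate (D) into glycine (G), which is what the name D614G encodes.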


COVID Cloud is hosted by DNAstack as a free service deployed on Microsoft Azure. The software that powers COVID Cloud is available to license for sharing public or private collections of genomics and clinical data related to COVID-19 or other disease areas.

The development of COVID Cloud has been supported through feasibility funding of the Digital Technology Supercluster’s COVID-19 Program, which aims to improve the health and safety of Canadians and support Canada’s ability to address issues created by the COVID-19 outbreak.

About DNAstack

DNAstack’s mission is to improve the lives of millions of people by breaking down barriers to data sharing and discovery. DNAstack develops standards and technologies for scientists to more efficiently find, access, and analyze the world’s exponentially growing volumes of genomic and biomedical data. For additional support or partnership interest, please contact us by email to info@dnastack.com.

About Digital Technology Supercluster

The Digital Technology Supercluster solves some of industry’s and society’s biggest problems through Canadian-made technologies. We bring together private and public sector organizations of all sizes to address challenges facing Canada’s economic sectors including healthcare, natural resources, manufacturing, and transportation. Through this ‘collaborative innovation,’ the Supercluster helps to drive solutions better than any single organization could on its own. The Digital Technology Supercluster is led by industry leaders such as D-Wave, LifeLabs, LlamaZOO, Lululemon, MDA, Microsoft, Mosaic Forest Management, Sanctuary AI, Teck Resources Limited, TELUS,  Terramera, and 1Qbit. Together, we work to position Canada as a global hub for digital innovation. A full list of Members can be found here.

Media Inquiries

For DNAstack media inquiries: Christine Beyaert, christine@dnastack.com

For Digital Technology Supercluster related media inquiries: Elysa Darling, elysa@switchboardpr.com


News
July 13, 2020

SARS-CoV-2: Biology, Origins, and How Open Science is Accelerating the Search for Therapeutic Answers

DNAstack's bioinformatician Heather Ward breaks down the biology of the novel coronavirus responsible for the COVID-19 outbreak.


Introduction

Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is the novel coronavirus responsible for the COVID-19 outbreak that first emerged in early December 2019 in Wuhan, China. As of March 20, 2020, SARS-CoV-2 has resulted in nearly 250,000 cases worldwide, claiming the lives of over 10,000 people.

Here, I’ll briefly break down the potential origins and viral life cycle of SARS-CoV-2, how it differs from the virus responsible for the 2002 outbreak, and how genomics and open science can be used to explore and develop therapeutics that will help mitigate this global threat.

SARS-CoV-2 and related coronaviruses

SARS-CoV-2 is a coronavirus, a member of a class of positive-sense single-stranded RNA (ssRNA) viruses so named for their resemblance to solar coronas. Other ssRNA viruses cause diseases that range widely in severity; examples include HIV, West Nile virus, and the viruses behind the common cold.

There are several coronaviruses known to infect humans, with the most well-known being SARS-CoV (responsible for the 2002 outbreak) and MERS-CoV (Middle East Respiratory Syndrome Coronavirus). Both of these coronaviruses, as well as the current SARS-CoV-2, are believed to have originated in bats, which act as a natural reservoir for a number of coronaviruses. Each virus is postulated to have passed to humans via an intermediary host (civet cats in the case of SARS-CoV, and dromedary camels for MERS-CoV). Several potential hosts have been suggested as the intermediary for the current SARS-CoV-2, including snakes and pangolins.

It’s important to note that the majority of these bat-endemic coronaviruses are not able to infect humans, and mutation is required for a coronavirus to be able to transition to a new host organism. Understanding which parts of the genome must mutate for a virus to target a new host first requires an understanding of the basics of the coronavirus life cycle.

Figure 1: SARS-CoV-2 virion.

The SARS-CoV-2 viral life cycle

The major steps of the viral life cycle of SARS-CoV-2 as well as other coronaviruses include:

  1. Binding of the virus to a receptor on a target host cell
  2. Membrane fusion between the viral envelope and the host cell, which releases the viral genome into the host cell
  3. Replication of the viral genome
  4. Transcription and translation of viral structural proteins
  5. Assembly and export of mature virions

Mature virions (packaged viral particles including the viral genome and structural proteins, see the SARS-CoV-2 virion pictured in figure 1) released from an infected host cell may infect other cells and continue the infection cycle.

If virions are unable to bind to host cell receptors or if membrane fusion does not occur, infection will not take place. These key steps are both mediated by a particular viral protein — the spike protein.

The Spike protein

The spike protein is a homotrimeric (made up of three identical peptides) transmembrane protein found studded around the exterior of the mature virion. Each monomer (one of the three identical peptides) is composed of two subunits: the S1 subunit, which is responsible for recognizing and binding to a host cell receptor, and the S2 subunit, which facilitates membrane fusion and release of the viral genome into the host cell (see figure 2).

Because the virus can only infect host cells that it is able to bind to, the S1 subunit of the spike protein is responsible for host specificity — the range of hosts that the virus is able to infect. In order for a virus to be able to infect a new organism — e.g. in the transition between bat and human hosts — the receptor binding domain of the S1 subunit must gain the ability to bind to a receptor found in that new host. In both SARS-CoV and SARS-CoV-2, the human receptor appears to be the protein angiotensin converting enzyme 2 (ACE2), which is found on the surface of cells in the human respiratory tract. Interestingly, despite targeting the same receptor protein, many of the key amino acids that interact with the ACE2 receptor and that were previously thought to be essential for binding to ACE2 appear to be almost completely distinct between the SARS-CoV and SARS-CoV-2 receptor binding domains, implying that specificity for the same receptor may have evolved independently in each strain.

Figure 2: Structure of the SARS-CoV spike protein monomer (blue and green) bound to the ACE2 receptor (yellow). The spike protein is comprised of the S1 (blue) and S2 (green) subunits. S1/S2 and S2' cleavage sites are labelled in red. Generated using open-source PyMOL™ from the cryo-EM structure.

Activation of the spike protein following receptor binding

Receptor binding alone is not sufficient for viral infection. Binding initiates conformational changes in the spike protein that lead to membrane fusion and infection, but another step is required before fusion can take place: cleavage of the spike protein.

There are at least two cleavage sites on the spike protein that must be cut prior to viral entry: one between the S1 and S2 subunits (S1/S2 site) and one internal to the S2 subunit (S2' site) (see figure 2, red). Cleavage at the S1/S2 site primes the protein and leads to cleavage of the S2' site, which is necessary for membrane fusion. The specific proteases (proteins that cut other proteins) that are able to perform the cleavage steps depend on the amino acid sequence that is present at each cleavage site; in many cases, several different proteases are able to cut the same site with greater or lesser efficiency.

Similar to the host-specificity of the receptor binding domain, if cleavage sites are not recognized by host proteases, cleavage and therefore infection will not be able to occur in that host. This means that both a receptor binding domain that recognizes a host target as well as cleavage sites that can be cut by host proteases are required for transmission of the virus to a novel host. For example, some bat coronaviruses have been found that are able to bind to human proteins but fail to initiate infection because their spike protein is not cleaved in human hosts.

A novel cleavage site on SARS-CoV-2

In SARS-CoV-2, a novel cleavage site has been discovered at the S1/S2 junction which is cleaved by a ubiquitous human protease known as furin. The inclusion of this novel furin site allows the SARS-CoV-2 spike protein to be cleaved during biosynthesis — this means that the protein is ‘primed’ even prior to release of the virion from the host cell. This is in contrast to the spike protein produced by SARS-CoV, which lacks this site and is released from the cell intact, requiring later cleavage before it can facilitate membrane fusion.
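Because protease recognition depends on the amino acid sequence at the cleavage site, a candidate furin site can be flagged with a simple pattern scan. The sketch below searches for the minimal furin-like motif R-X-X-R (arginine, any two residues, arginine), which is a simplification of real furin specificity. The two short peptide fragments are illustrative excerpts of the S1/S2 regions quoted from memory of published alignments, and should be verified against the deposited sequences; a motif match suggests a possible site and does not by itself demonstrate cleavage.

```python
import re

# Minimal furin-like motif R-X-X-R; the lookahead allows overlapping matches.
MINIMAL_FURIN_MOTIF = re.compile(r"(?=(R..R))")


def find_furin_like_sites(peptide):
    """Return (0-based offset, matched residues) for each motif occurrence."""
    return [
        (match.start(), match.group(1))
        for match in MINIMAL_FURIN_MOTIF.finditer(peptide.upper())
    ]


if __name__ == "__main__":
    # Illustrative fragments around the S1/S2 junction (assumed, not from the article).
    sars_cov_2_fragment = "TNSPRRARSVA"
    sars_cov_fragment = "HTVSLLRSTSQ"
    print(find_furin_like_sites(sars_cov_2_fragment))  # prints [(4, 'RRAR')]
    print(find_furin_like_sites(sars_cov_fragment))    # prints []
```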

It is unclear whether priming during biosynthesis has an impact on viral infectivity; a 2006 study by Follis et al. found that the introduction of a furin cleavage site into SARS-CoV’s spike protein at the S1/S2 junction resulted in enhanced membrane fusion between virus and host, but could find no evidence for an accompanying increase in infectivity. It remains to be seen how the novel furin site in SARS-CoV-2 will impact its infectivity and spread.

A key target for therapeutic agents

Researchers across the globe are searching the SARS-CoV-2 genome for features that will allow it to be targeted by therapeutic agents. Due to the nature of the spike protein and its fundamental role in mediating host specificity and viral infection, it represents an attractive target for the development of therapeutic agents. In particular, mechanisms targeting receptor binding, proteolytic cleavage, and membrane fusion may prove effective in attenuating the virus’s ability to infect human cells. Due to the genetic similarity between the novel SARS-CoV-2 and SARS-CoV, including their shared receptor target, it is possible that agents shown to be effective against SARS-CoV may also prove effective at slowing SARS-CoV-2.

SARS-CoV-2 Research

The swift response of researchers worldwide to study SARS-CoV-2 and to share sequencing data publicly has allowed for rapid insights into key genetic features that will prove indispensable in the days and months to come. This tremendous, coordinated global effort to elucidate the origins and mechanisms of the virus could not have been accomplished without the aid of modern technologies allowing researchers to share data quickly across geopolitical borders. This reaffirms the essential role of technology in facilitating science, especially in the ability to respond quickly to global emergencies.

To that end, DNAstack has developed a beacon for SARS-CoV-2 where users can explore aggregated genetic variants discovered by labs worldwide. Explore it here: covid-19.dnastack.com.
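For readers unfamiliar with beacons, a lookup is typically a single HTTP query asking whether a given allele has been observed. The sketch below builds a query URL using the parameter names from the GA4GH Beacon v1 API. The base URL and the `assemblyId` value are hypothetical, and the article does not say which Beacon version or endpoint the SARS-CoV-2 beacon exposes. The example variant, an A-to-G change at position 23403, is chosen purely for illustration.

```python
from urllib.parse import urlencode


def beacon_query_url(base_url, reference_name, position, ref, alt, assembly_id):
    """Build a Beacon v1 style allele query URL from a 1-based position."""
    params = {
        "referenceName": reference_name,
        "start": position - 1,  # Beacon v1 uses 0-based start coordinates
        "referenceBases": ref,
        "alternateBases": alt,
        "assemblyId": assembly_id,
    }
    return f"{base_url.rstrip('/')}/query?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical server and assembly identifier.
    print(
        beacon_query_url(
            "https://beacon.example.org",
            reference_name="MN908947.3",
            position=23403,
            ref="A",
            alt="G",
            assembly_id="MN908947.3",
        )
    )
```

The coordinate shift is the detail most worth getting right: VCF files and most publications use 1-based positions, so passing them through unchanged would query the neighbouring base.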

About the Author

Heather is part of the Data Science Team at DNAstack, where she authors, tests, and runs analytical pipelines for internal and customer projects.

References and Further Reading

  • Belouzard, S., Chu, V.C. and Whittaker, G.R. 2009. Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites. PNAS 106(14): 5871–5876.
  • Chan, J.F.W., Kok, K-H., Zhu, Z., Chu, H., To, K. K-W., Yuan, S. and Yuen, K-Y. 2020. Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan. Emerging Microbes & Infections 9: 221–246.
  • Coutard, B., Valle, C., de Lamballerie, X., Canard, B., Seidah, N.G. and Decroly, E. 2020. The spike glycoprotein of the new coronavirus 2019-nCoV contains a furin-like cleavage site absent in CoV of the same clade. Antiviral Research 176: 104742.
  • Follis, K.E., York, J. and Nunberg, J.H. 2006. Furin cleavage of the SARS coronavirus spike glycoprotein enhances cell-cell fusion but does not affect virion entry. Virology 350: 358–369.
  • Gong, S. and Bao, L-L. 2018. The battle against SARS and MERS coronaviruses: Reservoirs and animal models. Animal Model Exp Med 1: 125–133.
  • Millet, J.K. and Whittaker, G.R. 2015. Host cell proteases: critical determinants of coronavirus tropism and pathogenesis. Virus Research 202: 120–134.
  • Racaniello, V. Furin cleavage site in the SARS-CoV-2 coronavirus glycoprotein. Virology blog. http://www.virology.ws/2020/02/13/furin-cleavage-site-in-the-sars-cov-2-coronavirus-glycoprotein/. Published February 13, 2020. Accessed March 10, 2020.
  • Song, W., Gui, M., Wang, X. and Xiang, Y. 2018. Cryo-EM structure of the SARS coronavirus spike glycoprotein in complex with its host cell receptor ACE2. PLOS Pathogens. https://doi.org/10.1371/journal.ppat.1007236
  • Walls, A.C., Park, Y-J., Tortorici, M.J., Wall, A., McGuire, A.T. and Veesler, D. 2020. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell 180: 1–12.
  • Wong, M.C., Cregeen, S.J., Ajami, N.J. and Petrosino, J.F. 2020 (preprint). Evidence of recombination in coronaviruses implicating pangolin origins of nCoV-2019. bioRxiv, preprint. https://doi.org/10.1101/2020.02.07.939207
  • Xia, S., Zhu, Y., Liu, M., Lan, Q., Xu, W., Wu, Y., Ying, T., Liu, S., Shi, Z., Jiang, S. and Lu, L. 2020. Fusion mechanism of 2019-nCoV and fusion inhibitors targeting HR1 domain in spike protein. Cellular & Molecular Immunology. https://doi.org/10.1038/s41423-020-0374-2
  • Xu, X., Chen, P., Wang, J., Feng, J., Zhou, H., Li, X., Zhong, W. and Hao, P. 2020. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission. Science China Life Sciences 63(3): 457–460.

About DNAstack

DNAstack’s mission is to improve the lives of millions of people by breaking down barriers to data sharing and discovery. DNAstack develops standards and technologies for scientists to more efficiently find, access, and analyze the world’s exponentially growing volumes of genomic and biomedical data. For additional support or partnership interest, please contact us by email to info@dnastack.com.

Photo Credits 

Figure 1: CDC/Alissa Eckert, MS; Dan Higgins, MAMS
Figure 2: Song et al., 2018; PDB accession 6ACK.

News
July 13, 2020

SARS-CoV-2: Biology, Origins, and How Open Science is Accelerating the Search for Therapeutic Answers

DNAstack's bioinformatician Heather Ward breaks down the biology of the novel coronavirus responsible for the COVID-19 outbreak.

Podcast
May 11, 2020

Omics Xchange Podcast - COVID-19 Beacon: an Interview with Marc Fiume

Press Releases
May 4, 2020

Software Tool Built by U of T Startup Shares Genetic Data with COVID-19 Researchers Around the World

Co-founded by U of T alumnus Marc Fiume, DNAstack launched a search engine aimed at the global research community that scans and indexes genomic information about the novel coronavirus.

Press Releases
April 20, 2020

Using Innovation to Protect Canadians and Secure Our Economy

Canada's Digital Technology Supercluster is supporting Canadians in the fight against COVID-19. 

Press Releases
March 20, 2020

DNAstack Launches COVID-19 Beacon to Accelerate Sharing Genomic Data in the Fight Against Novel Coronavirus

DNAstack today introduced a Beacon for SARS-CoV-2, the virus that causes COVID-19, available at covid-19.dnastack.com.

Press Releases
November 11, 2019

Supercluster Helps DNAstack Keep up with its Ambitions

Toronto company is ‘getting noticed’ as it works to build the digital infrastructure to power the next generation of scientific research.

The Beacon Network, where new Clinical Evidence Beacons can be searched for crowdsourcing classification of genomic variants.

DNAstack today announced the launch of Clinical Evidence Beacons on the Beacon Network, a real-time search engine for finding genetic mutations across a global network of genomic datasets. These additions will enable medical laboratories to crowdsource the interpretation of variants through a secure social network.

Accurately interpreting DNA variants identified through genetic testing is essential for patients and clinicians to make informed medical decisions across a growing number of medical use cases. While some of those variants can be confidently predicted to be pathogenic or benign based on previous studies and data accessible through variant interpretation resources, in many cases evidence is missing or inconsistent, resulting in conflicting evaluations or reporting as “variants of unknown significance” (VUS). Clinical Evidence Beacons facilitate faster and more consistent variant classifications by securely sharing variant interpretation evidence between collaborating organizations, accelerating the exchange of critical knowledge and improving support for patients affected by genetic diseases and carriers of variants that have an impact on medical decision making.

Building upon the Global Alliance for Genomics and Health (GA4GH) Beacon API, an open standard that allows researchers to determine whether a given variant exists within a genomic dataset, Clinical Evidence Beacons are an extension of the protocol being piloted by DNAstack, allowing uncurated knowledge about a variant to be shared and discovered in real time.

“The Beacon API 1.0, which was approved as a GA4GH standard last year, validates the international community’s willingness to work together to define standards and engage in data sharing in a meaningful way,” said Miro Cupak, VP Engineering at DNAstack. “The original protocol was intentionally simple. We’ve since been exploring more powerful derivatives, and learned that by integrating clinical data in the payload of the Beacon that we can help solve outstanding issues faced by the clinical genomics community. Clinical Evidence Beacons, while not formally approved as a GA4GH standard, demonstrate one potential application being designed for future versions of the protocol.”

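For readers unfamiliar with the protocol, the basic Beacon question ("does this variant exist in your dataset?") can be sketched as a single web query with a yes/no answer. The Python below is a minimal illustration only: the endpoint URL is hypothetical, the parameter names and the `exists` response field follow our reading of the Beacon v1.0 specification, and the clinical evidence payload piloted by DNAstack is not shown because its format is not described here.

```python
from urllib.parse import urlencode

# Hypothetical Beacon endpoint, used only for illustration (an assumption).
BEACON_QUERY_URL = "https://beacon.example.org/query"


def build_beacon_query(reference_name, start, reference_bases, alternate_bases,
                       assembly_id="GRCh37"):
    """Return a Beacon v1-style query URL asking whether one variant exists."""
    params = {
        "referenceName": reference_name,    # chromosome or contig name
        "start": start,                     # variant position (0-based in Beacon v1.0)
        "referenceBases": reference_bases,  # allele in the reference genome
        "alternateBases": alternate_bases,  # allele being searched for
        "assemblyId": assembly_id,          # genome build the position refers to
    }
    return f"{BEACON_QUERY_URL}?{urlencode(params)}"


def variant_exists(response):
    """Read the yes/no answer from a decoded Beacon response (a dict)."""
    return bool(response.get("exists"))


url = build_beacon_query("1", 1000, "A", "T")
print(url)

# A hand-written example response, not output from a real Beacon.
print(variant_exists({"beaconId": "example-beacon", "exists": True}))
```

The response carries only a presence flag, which is why the original protocol is described as intentionally simple; an extension such as Clinical Evidence Beacons would add further fields alongside it.
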
Miro Cupak, VP Engineering at DNAstack

The first Clinical Evidence Beacons to join the Beacon Network come from the Canadian Open Genetics Repository (COGR), a network of over 20 laboratories that have come together to share information about variants and clinical cases. Members of the COGR are now able to search controlled-access Clinical Evidence Beacons for variants of interest on the Beacon Network. A national effort to improve the quality of variant classification has been led by the COGR, which previously reported that the percentage of variants with discordant classifications dropped from 26.7% to 14.2% as a result of crowdsourcing, demonstrating the power of collaboration between clinical genomics labs.

“Our understanding of genetic data continues to evolve and there is often not a one-to-one correlation between genetic variation and disease, so international or global data sharing efforts are vital to moving the field forward,” said Dr. Jordan Lerner-Ellis, principal investigator of the COGR and Head of Advanced Molecular Diagnostics at Toronto’s Mount Sinai Hospital, Sinai Health System and Associate Professor at the University of Toronto. “Systems that allow for easily accessible real-time data sharing will be increasingly important to be able to provide the most up-to-date information and to translate it into patient care.”

While the cost of genome sequencing has decreased significantly, it is still costly. Software that enables valuable biomedical data to be shared will enable future healthcare systems to draw on distributed collections of data in real time, unconstrained by traditional institutional silos and long publication cycles. As we move toward personalized healthcare, there is a need for such systems to integrate genomic information, clinical data, and real-world evidence to better inform treatment decisions.

“There is an enormous need to share genomic information and we have seen worldwide interest in the application of Beacons in healthcare environments,” said Jordi Rambla, European Genome-phenome Archive (EGA) Team Lead at the Centre for Genomic Regulation (CRG). “Working with the clinical community, we are pioneering ideas to improve upon Beacon 1.0, and using this knowledge and experience to inform the next version of this standard. Ultimately, Clinical Evidence Beacons could make sharing genomic information, as well as phenotypic data, easier, faster, and more securely than is possible today, accelerating knowledge exchange, diagnoses, and improvements to patient care.”

Jordi Rambla, European Genome-phenome Archive Team Lead at the Centre for Genomic Regulation

While DNAstack’s support for clinical use cases has been developed as extensions to the current Beacon version, an international effort coordinated by the GA4GH Discovery Work Stream and led by the ELIXIR Beacon Project is currently preparing a major upgrade of the GA4GH Beacon protocol. Since the roadmap of the next version includes changes supporting a variety of stakeholder-defined biomedical use cases, incorporation of the upcoming Beacon protocol into software supported by DNAstack will accelerate future applications for genomic variant research and discovery.

About DNAstack

DNAstack’s mission is to improve the lives of millions of people affected by genetic disease by breaking down barriers to data sharing and discovery. DNAstack develops standards and technologies for scientists to more efficiently find, access, and analyze the world’s exponentially growing volumes of genomic and biomedical data.

Photo Credits

Cape Canaveral Air Force Station, United States. Photo Credit: SpaceX

News
October 14, 2019

DNAstack Launches Clinical Evidence Beacons to Drive Crowdsourcing for Genetic Disease Discovery

Here we describe the Beacon protocol and how it can be used as a model for the federated discovery and sharing of genomic data.

Press Releases
March 4, 2019

Federated Discovery and Sharing of Genomic Data Using Beacons

Press Releases
November 27, 2018

Canada's Digital Technology Supercluster Officially Launches with $153M in Funding from ISED

The Government of Canada is investing up to $950 million over five years to support industry-led innovation superclusters across the country and accelerate economic growth, productivity, and competitiveness across five Superclusters.

Press Releases
October 24, 2018

Genomic Data Interoperability, Remote Workflow Key to New Global Alliance APIs

Newly released APIs are the first products from the Global Alliance for Genomics and Health's strategic roadmap for interoperability of genomic data.

Press Releases
October 11, 2018

ClinGen Advancing Genomic Data‐Sharing Standards as a GA4GH Driver Project

ClinGen has joined with the Global Alliance for Genomics and Health (GA4GH) to support the development of open, freely‐available technical standards and regulatory frameworks for secure and responsible sharing of genomic and health‐related data.

Video
October 4, 2018

Beacon: The Story so Far

Project partners will expand on infrastructure developed by DNAstack for accessing genomic data and explore patient consent models that support nationwide sharing.

Press Releases
August 9, 2018

Canadian Precision Health Infrastructure Emphasizes Secure Data Sharing, Privacy, Consent

Press Releases
August 2, 2018

Registered Access: Authorizing Data Access

The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model—“registered access”—to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research.

DNAstack today announced its participation in a new project to accelerate the development of a national software platform for precision health in Canada.

The project — in which Deloitte, Genome BC, LifeLabs, Microsoft, Molecular You, Provincial Health Services Authority, and the University of British Columbia will also participate — is among the first to be selected and launched as part of Canada’s Digital Technology Supercluster, a federally funded program that recently received over $150M to stimulate the creation of competitive and innovative digital technology solutions for top industries.

With support from the Canadian government, the team is building a powerful new software platform that will make it easier for healthcare organizations, academic researchers, clinical laboratories, pharmaceutical companies, and other innovators to harness exponentially growing volumes of genomic and biomedical data. The platform will help drive new scientific discoveries and inform medical decisions, translating into more personalized and cost-effective healthcare for millions of Canadians.

The DNAstack team

The platform has been designed from the ground up around modern principles of data security, sharing, and analysis, and serves as an alternative path for organizations looking to avoid enormous, ongoing cost burdens associated with purchasing and maintaining local computational infrastructure. The platform is already being piloted with early adopters across the country, where it has proven to be dramatically more powerful, secure, cost-efficient, and accessible compared to other existing solutions. The project team aims to deliver the most advanced platform for precision health in the country, positioning healthcare organizations to roll out new programs that reap significant health and economic benefits for years to come.

“We’re laying the foundation for the future of genomic and biomedical science, where the combination of networked data and powerful technology is used to generate life-saving insights faster than ever before,” said Dr. Marc Fiume, CEO and Co-Founder at DNAstack. “With this platform, we’re empowering scientists to take big data, cloud computing, and machine learning to the fight against the biggest challenges in health.”

Dr. Marc Fiume, CEO and Co-Founder of DNAstack

The platform will provide easy-to-use tools for data producers (e.g. principal investigators, diagnostics laboratories, hospital systems, patient advocacy groups, individuals) to connect and administer the secure sharing of their datasets, and for data consumers (e.g. academic, clinical, pharmaceutical, and industry researchers) to discover and analyze that data using both gold-standard and custom applications. Individual users of the platform will be able to perform intensive statistical and machine learning analyses with on-demand access to hundreds of thousands of compute cores, more than 10 times the computing power of some of the best-equipped research institutions in Canada.

For DNAstack, the project is a continuation of years of global leadership and product innovation in the space. Since 2014, DNAstack has been an active member of the Global Alliance for Genomics & Health (GA4GH), where it contributes to the development of open standards for interoperable data sharing and analysis. This project is integrating key GA4GH protocols for identity, access, discovery, and analysis. In 2018, DNAstack co-founded the Canadian Genomics Cloud, the most computationally powerful public cloud platform for genomics and precision medicine in Canada, which is actively being used by leading scientists across the country to study the genetic causes of autism, adult cancer, pediatric cancer, heart disease, mental health, cystic fibrosis, and other rare diseases. DNAstack is now working in close collaboration with partners of the Digital Technology Supercluster, who bring diverse and complementary expertise, to introduce entirely new features to the market.

Bill Tam, Vice President of Business Development and Partner Relations, Canada’s Digital Technology Supercluster

“We are supporting ambitious opportunities that can’t be tackled by one company alone. Through a collective effort, this project aims to make a global impact and position Canada as a world leader in health,” said Bill Tam, Vice President of Business Development and Partner Relations for Canada’s Digital Technology Supercluster. “We are proud that the Supercluster has created an elevated platform for leading Canadian SMEs like DNAstack to continue to innovate and grow.”

News
August 1, 2018

DNAstack to Co-Develop a National Platform for Precision Health Through Canada’s Digital Technology Supercluster

Press Releases
June 25, 2018

DNAstack and Autism Speaks® Announce Collaboration to Accelerate Scientific Discovery on One of the World's Largest Autism Genome Databases

The Autism Speaks MSSNG project will help researchers answer the many remaining questions about the genetic underpinnings of autism by sequencing the DNA of over 10,000 families affected by autism. 

Press Releases
March 14, 2018

Simplifying Research Access to Genomics and Health Data with Library Cards

The volume of genomics and health data is growing rapidly, driven by sequencing for both research and clinical use.

Press Releases
February 26, 2018

GA4GH Releases 2018 Strategic Roadmap

The Global Alliance for Genomics and Health (GA4GH) has announced their Strategic Roadmap, which includes a series of more than two dozen deliverables to be launched in 2018 and developed over the next one to three years. 

Press Releases
February 18, 2018

Canadian Genomics Cloud to Develop GA4GH Compliant Precision Medicine Platform

DNAstack, Canada’s Genomics Enterprise, Google, the Centre of Genomics and Policy, and more announce the launch of the Canadian Genomics Cloud (CGC): a national cloud-based infrastructure for genomics initiatives to share data across Canada.

Press Releases
February 9, 2018

Health and the Genome Puzzle: Mapping DNA Has Gotten Cheaper, But Do We Know How to Use the Data?

Not only was Michael Szego the ethics lead on the Personal Genome Project Canada — he was also a participant, agreeing to have his genome mapped and shared publicly. 

Press Releases
February 5, 2018

The Personal Genome Project Canada: Findings from Whole Genome Sequences of the Inaugural 56 Participants

The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. 

News
November 14, 2017

DNAstack Inks Partnership with Sentieon to Offer Faster, Cheaper, More Consistent Bioinformatics in the Cloud

DNAstack, a cloud genomics company, today is announcing a partnership with Sentieon, an award-winning bioinformatics software company. 

Through this partnership, a suite of Sentieon’s algorithms will be available to run through DNAstack’s Workflows app, delivering genomics data analysis pipelines in the cloud that are faster and cheaper, require no downsampling, and are 100% consistent, while using the same mathematics as the industry-standard best-practice workflows.

Sentieon technologies won precisionFDA’s Consistency Challenge for Top Overall Performance and Highest Reproducibility, and its Truth Challenge for highest SNP Recall and INDEL Precision. In the most recent precisionFDA Hidden Treasures-Warm Up challenge, along with 36 other submissions, all of Sentieon’s submissions caught all injected variants while using default parameters without any special filtering. Last year, Sentieon’s TNscope toolset also ranked #1 in the ICGC-TCGA DREAM Challenge for Somatic Mutation Calling in all three categories: SNV, indel, and SV.

“At DNAstack, our mission is to democratize access to genomics data and best-in-class technologies to analyze it at scale,” said Dr. Marc Fiume, CEO of DNAstack. “The addition of Sentieon’s algorithms to our Workflows marketplace lets anyone with an internet connection run their award-winning software simply and at any scale. From there, they can interpret results privately in the context of the large and growing global network on our platform.” Fiume also co-leads the Discovery Workstream of the Global Alliance for Genomics & Health, which develops industry standards for sharing of genomics data, tools, and services.

Sentieon’s DNAseq pipeline for germline FASTQ-to-VCF analysis running on DNAstack takes around 5 hours and costs less than $15 for a whole genome sequence at 30X coverage.

“Through the integration with DNAstack, Sentieon technologies will be made more accessible to a global community of scientists to help accelerate breakthrough discoveries and the implementation of precision medicine,” said Dr. Jun Ye, Sentieon’s CEO. “We look forward to leveraging the power and efficiencies of DNAstack’s platform to deliver Sentieon’s accurate and fast tools for read alignment and variant calling to DNAstack’s customers.”

“This can serve a very large user base that needs a simple and cost-effective ‘sequencer-to-scientist’ solution,” said Fiume. “Especially as genomics becomes increasingly integrated with clinical care, we see tremendous long-term value in having high-speed, low-cost, no-downsampling, and 100% reproducible solutions for data analysis.”

About DNAstack

DNAstack develops a cloud-based platform for genomics data analysis and sharing. Through collaborations with Google, Broad Institute, and the Global Alliance for Genomics & Health, DNAstack provides push-button access to state-of-the-art technologies to help researchers, clinical laboratories, and pharmaceutical companies more quickly and cost-effectively make sense of the world’s exponentially accumulating genomics data and break down barriers to data sharing.

Direct any questions to info@dnastack.com.

About Sentieon

Sentieon develops highly optimized and accurate algorithms for bioinformatics applications, using the team’s expertise in algorithm, software, and system optimization. Sentieon is a team of professionals using accumulated expertise in modeling, optimization, machine learning, and high-performance computing, to enable precision data for precision medicine.

News
September 25, 2017

Crowd-Sourcing Variant Interpretations to Improve Patient Outcomes

We fundamentally believe that democratization of genomics information through sharing will massively accelerate discoveries that will lead to better treatments and outcomes for patients affected by genetic diseases. 

We can avoid long diagnostic odysseys by connecting patients’ data into networks that leverage, in real time, large and exponentially growing volumes of information. This is the inspiration behind our work on the Beacon Project for the Global Alliance for Genomics & Health, where we are defining industry standards for sharing genomic variants.

Beyond sharing variants themselves, though, it is important to share their classifications (i.e., whether a variant causes disease). While there are standards and guidelines for interpreting genetic variants, it is common for different laboratories to classify them differently. One laboratory may consider a variant to be pathogenic while another may consider it benign, leading to discrepant diagnoses and treatment options between patients studied at different clinics.

The purpose of the Canadian Open Genetics Repository, led by Dr. Jordan Lerner-Ellis, Director of the Advanced Molecular Diagnostics Laboratory at Mount Sinai Hospital, and by Dr. Matthew Lebo, Associate Laboratory Director for the Laboratory for Molecular Medicine at Partners Healthcare, was to develop a national program for Canadian genetic diagnostic laboratories to crowd-source variant interpretations and resolve discordant classifications for variants.

The study included over 20 laboratories that classified over 5,000 variant observations in the BRCA1 and BRCA2 genes. Using a five-tier classification model, 38.9% of variants were discordant between laboratories; with a three-tier model, 26.7% were discordant. After crowd-sourcing variant classifications and their supporting evidence, discordance decreased to 30.7% under the five-tier model and 14.2% under the three-tier model. This study demonstrates the power of a crowd-sourcing platform to increase the level of consensus in variant classifications.

Learn More

The DNAstack platform is being used to share variants and their classifications as a public resource on opengenetics.ca and as a Beacon on the Beacon Network. The work was recently published in the pages of Genetics in Medicine and the full text is available for download here.

News
August 3, 2017

Introducing Workflows, the New Standard in Cloud Bioinformatics

Today, DNAstack announced the launch of a major update to its cloud-based genomics software platform. 

The update contains many new features, including a revolutionary application called Workflows that makes it easier and more affordable than ever for genome scientists to find, develop, share, and run bioinformatics workflows at scale. The app brings the most advanced technologies and standards for bioinformatics to market and positions DNAstack as a commercial leader in cloud genomics.

“Our mission is to improve and save lives by democratizing access to genomics data and the infrastructure needed to analyze it at scale,” said Dr. Marc Fiume, the company’s CEO and co-founder. Dr. Fiume also co-chairs the Beacon Project, an international data-sharing initiative that is part of the Global Alliance for Genomics & Health (GA4GH).

“With the release of Workflows, we’re making the latest technologies in genomics broadly accessible and fostering standards to promote open science, reproducibility, sharing, and collaboration.” — Marc Fiume, CEO at DNAstack

DNAstack is increasing global access to bioinformatics by offering its platform without a subscription fee for research use and competitive pay-per-use pricing for data storage and computation.

Users of the Workflows App can choose from a growing selection of high quality, curated pipelines like Broad Institute’s GATK Best Practices, a popular end-to-end genotyping toolkit, or author their own custom analyses. The Workflows App runs pipelines written in the Workflow Description Language (WDL), a language developed at the Broad Institute that is being adopted as a community standard. “WDL enables analysts to create and understand pipelines with as little friction as possible,” said Jeff Gentry, Senior Principal Software Engineer at the Broad Institute and co-chair of the Containers & Workflows Task Team at GA4GH.

“The ability to execute those pipelines in different environments, like DNAstack, is critical to the success of the whole ecosystem.” — Jeff Gentry, Broad Institute

DNAstack’s application includes a WDL editor for authoring workflows that gives bioinformaticians and software developers a unified platform for developing, refining, and running their pipelines in the cloud.

DNAstack has collaborated with the Ontario Institute for Cancer Research and the University of California, Santa Cruz to connect the Workflows App to Dockstore, a repository of tools and workflows shared by members of the community. Now, users can browse the catalog of pipelines available on Dockstore and click to deploy them on their own data in DNAstack. “Consistent and coordinated processing of millions of samples collected from around the world will be needed to resolve complex diseases like cancer,” said Dr. Brian O’Connor, who helped create Dockstore. Dr. O’Connor is the Technical Director of the Computational Genomics Platform at UC Santa Cruz and co-chair of the GA4GH Containers & Workflows Task Team. “Dockstore is a resource for sharing Docker-based workflow definitions, and the integration with DNAstack brings exciting new efficiencies and scale with which they can be run.” DNAstack is the first system to streamline execution of workflows from Dockstore.

DNAstack uses Google Cloud Platform to distribute the execution of computationally intensive tasks across tens of thousands of compute cores, on-demand. DNAstack has massive throughput, with the ability to process over a quarter million whole human genome sequences per year. DNAstack has recently become a Google Cloud Technology Partner, bringing deeper levels of integration between the two platforms.

Workflows is already being used around the world to accelerate research in cancer, autism, and rare disease. Academic and clinical researchers at The Hospital for Sick Children in Toronto have used the application to run thousands of workflows on DNAstack. As organizations continue to adopt whole genome sequencing as the new standard in genomic medicine, the addition of Workflows to DNAstack’s platform provides a simple, cost-effective, and powerful path toward implementing precision medicine in research and at the point of care.

The Workflows App is available to all DNAstack users. 

About DNAstack

DNAstack develops a cloud-based platform for genomics data analysis and sharing. Through collaborations with Google, Broad Institute, and the Global Alliance for Genomics & Health, DNAstack provides push-button access to state-of-the-art technologies to help researchers, clinical laboratories, and pharmaceutical companies more quickly and cost-effectively make sense of the world’s exponentially accumulating genomics data and break down barriers to data sharing.

Photo Credits

Feature photo by Michał Kubalczyk

Press Releases
July 20, 2017

Data Sharing as a National Quality Improvement Program: Reporting on BRCA1 and BRCA2 Variant-Interpretation Comparisons Through the Canadian Open Genetics Repository (COGR)

The purpose of this study was to develop a national program for Canadian diagnostic laboratories to compare DNA-variant interpretations and resolve discordant-variant classifications using the BRCA1 and BRCA2 genes as a case study.

News
February 15, 2017

Beacon Goes Global

Almost sixty percent of the human population resides in Asia and Africa, but only a fraction of the world’s human genomic sequencing efforts cover that community.
