4bases’ Paper of the Month – January 2024 – Benchmarking and improving the performance of variant-calling pipelines with RecallME

In the realm of DNA sequencing, both next-generation sequencing (NGS) and third-generation sequencing (TGS) have revolutionized genomics. While NGS dominates current research and clinical applications, TGS is an area of active exploration. The evolution of longer-read sequencing resolves computational hurdles in critical biological and medical domains such as genome assembly, transcript reconstruction, and metagenomics.

Variant calling, the process of identifying genetic variants from sequencing data, requires aligning experimental results with a reference. However, disparate formats, reference genomes, and nomenclatures often lead to inconsistencies in variant representation. Harmonizing variants becomes imperative to ensure accurate labeling and interpretation in experimental datasets.
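To make the harmonization problem concrete, here is a minimal Python sketch of one common step: trimming bases shared by the reference and alternate alleles so that two differently padded records compare as equal. The function name `trim_variant` and the example records are hypothetical, and this is not a description of how RecallME itself works. Full normalization would also left-align indels against the reference genome, which this sketch does not attempt.

```python
def trim_variant(pos, ref, alt):
    """Trim bases shared by REF and ALT, keeping at least one base in each.

    Simplified illustration only: no left-alignment against a reference.
    """
    # Drop shared trailing bases first.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop shared leading bases, shifting the position accordingly.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Two hypothetical records describing the same substitution with different padding.
record_a = trim_variant(100, "GCA", "GTA")
record_b = trim_variant(101, "C", "T")
print(record_a == record_b)
```

Once both records are reduced to the same position and alleles, they can be matched directly against a ground-truth set instead of being counted as two different variants.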

Moreover, optimizing variant-calling pipelines involves adjusting calling parameters and thresholds for quality metrics. Large-scale studies generally prefer to minimize false positives, whereas in diagnostic scenarios avoiding false negatives is imperative because a missed variant can have severe implications. In the diagnostic case, false positives can be flagged and rectified through subsequent sequencing rounds. False positives and negatives can arise from errors in sequencing or in the bioinformatic pipeline. By telling these two sources of error apart, researchers and clinicians gain the insight needed to make genomic analyses more reliable and precise, supporting more accurate interpretation and better-informed decisions.
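The trade-off described above can be illustrated with a short Python sketch that recomputes precision and recall as a quality threshold is raised. The variant identifiers, quality scores, and the helper `precision_recall` are all made up for illustration; they are not taken from the paper or from RecallME.

```python
def precision_recall(calls, truth, min_qual):
    """Return (precision, recall) for calls with quality >= min_qual."""
    kept = {variant for variant, qual in calls.items() if qual >= min_qual}
    tp = len(kept & truth)   # called and present in the truth set
    fp = len(kept - truth)   # called but absent from the truth set
    fn = len(truth - kept)   # in the truth set but not called (or filtered out)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical ground truth and calls (variant id -> quality score).
truth = {"v1", "v2", "v3", "v4"}
calls = {"v1": 50, "v2": 35, "v3": 12, "x1": 15, "x2": 40}

for threshold in (10, 30, 45):
    p, r = precision_recall(calls, truth, threshold)
    print(f"min_qual={threshold}: precision={p:.2f} recall={r:.2f}")
```

In this toy data, raising the threshold removes false positives but also discards true variants, so precision rises while recall falls. A diagnostic setting would favor the lower threshold and confirm the flagged calls afterwards.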

Presently, the primary benchmarking pipelines have limitations. They do not report quality parameter information for the false negatives and false positives they identify. Furthermore, they offer limited support for certain sequencing platforms, such as Ion Torrent-based (ION) sequencing, which is prevalent in clinical settings. Indeed, this technology shows heterogeneity in variant annotation and some limitations in detecting insertion and deletion variants, which may lead to incorrect estimates of the actual analytical performance. Finally, as TGS technologies gain traction, especially in diagnostics, the need for capable benchmarking tools grows. Although TGS technologies excel at resolving repetitive regions, detecting small variants remains a hurdle for them.

The paper of the month presents RecallME, a novel software tool that addresses variant annotation standardization, rapid quantification of performance metrics in NGS/TGS-based variant-calling (VC) pipelines, and discrimination between sequencing errors and bioinformatic errors. Its functionalities guide users in optimizing their pipelines efficiently.

In the study, the authors demonstrated RecallME’s efficacy on sequencing datasets. Among other things, they analyzed genomic DNA from a BRCA Germline Reference Standard using our HEVA pro kit, a powerful tool for identifying mutations in genes linked to breast and ovarian cancer, familial adenomatous polyposis, and hereditary nonpolyposis colorectal cancer. They used the MinION, developed by our partner Oxford Nanopore Technologies, to sequence this DNA, showcasing RecallME’s performance in TGS environments.

RecallME emerges as a transformative solution in the landscape of variant calling tools, filling critical gaps in standardization, performance evaluation, and error discrimination across sequencing platforms. We at 4bases are proud that our collaboration with Oxford Nanopore Technologies (ONT) contributes to the development of such a powerful and pivotal technology by combining ONT’s pioneering nanopore sequencing devices and our 4bases kits.

Source: Vozza, G., Bonetti, E., Tini, G., Favalli, V., Frigè, G., Bucci, G., De Summa, S., Zanfardino, M., Zapelloni, F., & Mazzarella, L. (2023). Benchmarking and improving the performance of variant-calling pipelines with RecallME. Bioinformatics, 39(12). https://doi.org/10.1093/bioinformatics/btad722
