Discovering genomic patterns in SARS-CoV-2 variants
D'Angelo G.; Palmieri F.
2020-01-01
Abstract
SARS-CoV-2 is a novel severe acute respiratory syndrome-like coronavirus (SARS-CoV) responsible for the ongoing worldwide COVID-19 pandemic. Although many approaches are being investigated to address this issue, at present no vaccines are available and there is little evidence supporting the efficacy of potential therapeutic agents. Moreover, the high mutation rate of this virus hampers the understanding of its evolution and diffusion mechanisms and, in turn, the development of effective solutions. In this study, two novel algorithms are provided for discovering recurrent patterns of nucleotide subsequences across different SARS-CoV-2 genomes, which serve as a unique signature capable of identifying the most peculiar features of the pathogen. In particular, we provide several subsequence patterns related to the Spike glycoprotein, which is believed to be the main target for developing effective drugs and vaccines against COVID-19 because of its role in the entry of coronaviruses into host cells. The experimental results, obtained by analyzing 5000 SARS-CoV-2 genomes, show that the extracted patterns recognize the Spike protein in 99.35% of the considered genomes. In addition, such patterns have proven to be highly discriminating with respect to other pathogenic genomes, such as those of SARS, Middle East respiratory syndrome, Nipah, and Streptococcus bacteria. We hope that the findings presented in this study can help specialists speed up the design of more accurate drugs or vaccines against SARS-CoV-2.
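The abstract does not describe the two algorithms themselves, so the following is only a minimal sketch of the general idea of a recurrent, discriminating subsequence signature. It assumes fixed-length subsequences (k-mers) and exact matching, which may differ from what the study actually does. All function names, the `min_fraction` parameter, and the toy sequences are hypothetical and introduced purely for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recurrent_kmers(genomes, k, min_fraction=0.99):
    """Return k-mers that occur in at least min_fraction of the genomes."""
    counts = Counter()
    for genome in genomes:
        # Each genome contributes at most once per k-mer.
        counts.update(kmers(genome, k))
    threshold = min_fraction * len(genomes)
    return {kmer for kmer, n in counts.items() if n >= threshold}


def discriminating_kmers(target_genomes, other_genomes, k, min_fraction=0.99):
    """Return recurrent target k-mers that appear in none of other_genomes."""
    candidates = recurrent_kmers(target_genomes, k, min_fraction)
    for genome in other_genomes:
        candidates -= kmers(genome, k)
    return candidates


def recognition_rate(genomes, signature):
    """Return the fraction of genomes containing at least one signature k-mer."""
    hits = sum(any(s in genome for s in signature) for genome in genomes)
    return hits / len(genomes)


# Toy, made-up sequences (not real viral genomes).
target = ["ACGTACGGA", "TTACGGATC", "GACGGAAAC"]
others = ["ACGTTTTT", "GGATCCCC"]

signature = discriminating_kmers(target, others, k=5, min_fraction=1.0)
print(signature)
print(recognition_rate(target, signature), recognition_rate(others, signature))
```

In this sketch, the recognition rate plays the role of the percentage reported in the abstract: the share of genomes in which at least one extracted pattern is found. Real genomes contain ambiguous bases and mutations, so a practical method would likely need tolerant matching rather than the exact substring test used here.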