How Artificial Intelligence Is Redefining RNA Virus Discovery

Metagenomic approaches to RNA virus discovery currently detect only viruses that share sequence similarity with known ones. This makes it difficult to find the highly divergent viruses that make up a significant portion of the “dark matter” of the virosphere.

We developed LucaProt, a deep learning algorithm, and used it to search 10,487 meta-transcriptomes from around the world for highly divergent RNA-dependent RNA polymerase (RdRP) sequences. By combining sequence and structural features, the approach detects RdRP sequences both accurately and efficiently. With this method, we discovered 180,571 RNA virus species and 180 major viral groups (phyla/classes).

This is the broadest range of RNA viruses identified so far, and some of them cannot be detected with BLAST or HMM approaches. The newly found viruses occur in diverse habitats, including the air, hot springs, and hydrothermal vents, and their abundance and diversity vary substantially between ecosystems.

The study also identified the longest RNA virus genome reported, at 47,250 nucleotides, and expanded the diversity of RNA bacteriophages to more than ten phyla/classes. These results mark the start of a new era in virus discovery. They could reshape our knowledge of the global virosphere and revise our understanding of virus evolution.


In our earlier study of the marine sediments of Aarhus Bay, several organohalide-respiring bacteria (OHRB) were identified among the metagenome-assembled genomes (MAGs). However, their roles and interactions remain poorly understood. Obtaining pure cultures or more defined consortia would therefore be very helpful for further eco-physiological research.

To this end, we isolated a consortium from an anaerobic slant tube culture that had been inoculated with a stable PCE-dehalogenating enrichment. Remarkably, under sulfate-reducing conditions the resulting consortium carried out only debromination, not PCE dechlorination. The consortium was able to conserve energy for growth through debromination of 2,6-dibromophenol (2,6-DBP).

Analysis of 16S rRNA gene sequences extracted from the shotgun metagenome data indicated that a Desulforhopalus strain was the most abundant member of the consortium, with a relative abundance of 29%. In addition, five bins (completeness > 85% and contamination < 3%) were reconstructed. All five bins likely represent new species (average nucleotide identity, ANI < 95%).
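The thresholds quoted above can be read as two simple filters: a quality filter on completeness and contamination, followed by a novelty check on ANI. The sketch below is only an illustration of that logic. The `Bin` class, the helper function names, the bin names, and all numeric values are hypothetical and are not data or code from the study; it also assumes that "ANI" refers to the highest ANI of a bin against any reference genome.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """A metagenomic bin with summary statistics (all fields hypothetical)."""
    name: str
    completeness: float   # percent
    contamination: float  # percent
    best_ani: float       # assumed: highest ANI (%) to any reference genome


def is_high_quality(b: Bin, min_completeness: float = 85.0,
                    max_contamination: float = 3.0) -> bool:
    """Quality filter: completeness > 85% and contamination < 3%."""
    return b.completeness > min_completeness and b.contamination < max_contamination


def is_putative_new_species(b: Bin, ani_cutoff: float = 95.0) -> bool:
    """Novelty check: ANI below the 95% cutoff quoted in the text."""
    return b.best_ani < ani_cutoff


# Made-up example values for illustration only; not results from the study.
bins = [
    Bin("example.1", completeness=97.2, contamination=1.1, best_ani=82.4),
    Bin("example.2", completeness=88.0, contamination=2.9, best_ani=96.3),
    Bin("example.3", completeness=70.5, contamination=0.5, best_ani=80.0),
    Bin("example.4", completeness=92.0, contamination=4.2, best_ani=78.9),
]

kept = [b for b in bins if is_high_quality(b)]
novel = [b.name for b in kept if is_putative_new_species(b)]

print("passed quality filter:", [b.name for b in kept])
print("putative new species:", novel)
```

In this toy input, two bins fail the quality filter (one on completeness, one on contamination), and one of the remaining bins has an ANI above the cutoff, so it would be assigned to a known species rather than flagged as new.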

Two bins, bin.3 (Desulfoplanes) and bin.4 (Marinifilaceae), were found to carry genes encoding reductive dehalogenase (RDase). Bin.5 carried a gene encoding a thiolytic tetrachloro-p-hydroquinone (TPh-) RDase with 23.4% similarity to the TPh-RDase of Sphingobium chlorophenolicum.

After 2,6-DBP was added, expression of all three RDase genes increased significantly. Acetylene, an inhibitor of some redox-active metalloenzymes, inhibited both methanogenesis and reductive dehalogenation without changing gene expression, indicating that the inhibition is post-transcriptional.

Phylogenomic analyses highlighted the ecological roles of the community members, including the capacity to synthesize vitamin B12 de novo, and physiological data supported these findings. Together, the results improve our understanding of the mutualistic interactions within the consortium and suggest ways to enhance bioremediation with synthetic OHR communities.

Integrating AI in RNA virus discovery has opened up new avenues for research. It has the potential to significantly improve our ability to detect, prevent, and manage viral infections. As technology advances, AI will likely play an increasingly important role in the fight against infectious diseases, making it an exciting time for virology.

Source: bioRxiv