
Retrovirus Projects

Robert J. Gifford edited this page Nov 30, 2024 · 74 revisions

Overview

Retroviruses are RNA viruses that replicate by reverse transcription, copying their RNA genome into DNA and integrating it into the host genome. Integration allows them to persist as proviruses, blurring the line between infectious agents and endogenous elements of the host genome. Retroviruses such as human immunodeficiency virus (HIV) and human T-lymphotropic virus (HTLV) are best known as causes of significant human disease. However, their study extends beyond traditional virology, because they are both viruses and molecular components of host genomes. Through long-term integration, retroviruses can also shape host species evolution, influencing gene expression and genomic architecture.

This interface between infection and host biology makes retroviruses a focal point not only in the study of viral pathogenesis but also in understanding fundamental biological processes. Endogenous retroviruses (ERVs), the remnants of ancient retroviral infections, comprise a significant portion of vertebrate genomes, contributing to host evolution and playing important roles in immunity, reproduction, and gene regulation. Retroviral research bridges virology and whole-organism biology, providing insights into virus-host co-evolution, the origins of genomic innovations, and the mechanisms underlying viral persistence and immune evasion. The study of retroviruses therefore encompasses their roles both as pathogens and as long-term genomic elements, making them critical to fields ranging from molecular evolution to cancer biology and gene therapy.



Contents

GLUE projects developed for retroviruses in the Gifford Lab:

  • RVdb
  • ERVdb
  • Lentivirus-GLUE
  • Deltaretrovirus-GLUE


RVdb

Background

RVdb was developed to support the implementation of retrovirus taxonomy in association with the Retrovirus Study Group of the International Committee on Taxonomy of Viruses (ICTV).

The ICTV is the global authority responsible for the classification and naming of viruses. Established in 1966 (originally as the International Committee on Nomenclature of Viruses), the ICTV is part of the International Union of Microbiological Societies (IUMS). Its primary objectives are to create a universally accepted virus taxonomy, define virus species and higher taxonomic ranks, and establish standardized virus names. The ICTV's work organizes the vast diversity of viruses into a coherent classification, aiding communication and understanding across the virology community.

Scope and History

RVdb collates information about retrovirus taxonomy, along with sequences, alignments and phylogenies. Via GLUE, phylogenetic reconstructions used as the basis for taxonomic classification are made accessible and re-usable.

Features

  • Comprehensive Reference Sequence Set: Incorporates reference sequence data for all retrovirus species recognised by the ICTV.

  • ICTV and NCBI Integration: Links NCBI retrovirus reference sequences to ICTV species.

  • RT Alignment: Incorporates a codon-based RT alignment that covers all retrovirus species and can be edited in a version-controlled way (via GitHub).

  • Reproducible RT Phylogeny: GLUE implements a reproducible process for building the reverse transcriptase (RT) phylogenies on which retroviral taxonomy is based.
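
In a codon-based alignment such as the RT alignment listed above, gaps fall on codon boundaries so that the reading frame is preserved. One common way to obtain such an alignment is to align the translated proteins and then thread each coding sequence back onto its aligned protein. The sketch below illustrates that idea only; it is not GLUE's implementation, and the function name `thread_codons` and the toy sequences are hypothetical.

```python
def thread_codons(aligned_protein: str, nucleotides: str) -> str:
    """Thread an unaligned coding sequence onto its aligned protein.

    Each residue column receives the corresponding codon and each gap
    column ('-') receives a three-nucleotide gap ('---').
    """
    if len(nucleotides) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3)]
    residue_count = len(aligned_protein.replace("-", ""))
    if len(codons) != residue_count:
        raise ValueError("coding sequence does not match protein length")
    codon_iter = iter(codons)
    return "".join(
        "---" if column == "-" else next(codon_iter)
        for column in aligned_protein
    )


# Toy example (hypothetical sequences, not real RT data).
protein_alignment = {"seqA": "MK-L", "seqB": "MKQL"}
coding_sequences = {"seqA": "ATGAAACTG", "seqB": "ATGAAGCAGCTG"}

codon_alignment = {
    name: thread_codons(protein_alignment[name], coding_sequences[name])
    for name in protein_alignment
}
for name, row in codon_alignment.items():
    print(name, row)
```

Because every gap is a multiple of three nucleotides, each column triplet in the result can still be translated in frame, which is what makes a codon-based alignment convenient for protein-level comparison.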

Core Project Overview

| Property | Description |
| --- | --- |
| Scope | Retroviruses (family Retroviridae; exogenous viruses only) |
| Development Period | 2024-present |
| Lead Developers | Robert J. Gifford |
| Main Objectives | Comparative genomics, Molecular epidemiology |
| Data Sources | NCBI |
| Associated Tools | BLAST+, MAFFT, RAxML |
| Offline Project | GitHub |
| Online Access | None as yet |
| Status | Mature. Actively being developed |
| User Guide | None yet |

Extension Layers

  • ERVdb: Technically an extension of RVdb, but discussed as a separate project below because of its extensive scope and its role as the foundation for a broad range of ERV resources. RVdb provides the exogenous retrovirus component of ERVdb.


ERVdb

Background

A defining feature of retroviruses is their unique replication strategy, which involves the reverse transcription of the viral RNA genome into DNA and its integration into the host cell's nuclear genome as a "provirus." While most retroviral infections occur in somatic cells, occasional infections of germline cells (such as sperm, eggs, or early embryos) result in the viral DNA being passed down through generations as part of the host genome. These inherited sequences are known as endogenous retroviruses (ERVs).

Once incorporated into the germline, ERVs can expand within the host genome through processes such as reinfection of germline cells and retrotransposition. This proliferation often results in multi-copy ERV lineages, with tens to thousands of related sequences scattered across the genome. Although many ERV insertions are lost over time due to genetic drift or purifying selection, some become fixed within populations, leaving a lasting genomic footprint. Today, ERVs make up an estimated 5-10% of vertebrate genomes, offering an unparalleled molecular fossil record of the long-term evolutionary interplay between retroviruses and their hosts.

Beyond their evolutionary significance, ERVs have played a key role in shaping vertebrate genomes and host physiology. For instance, ERVs have been implicated in critical processes such as placentation, antiviral immunity, and the regulation of gene expression. Their dual role as genomic fossils and functional genomic elements makes ERVs a unique subject of study, offering insights into both host-virus co-evolution and the evolutionary innovations driven by viral sequences.

Scope and History

Retroviruses are a long-standing research focus in the Gifford Lab, and the development of GLUE was motivated in large part by the need for a systematic approach to organizing and analyzing ERV data. The design of GLUE was also informed by the overlap between the techniques required for ERV studies and those used in broader viral analyses, such as genomic epidemiology.

By fostering a flexible framework for shared methodologies, ERVdb aims to facilitate the collaborative study of endogenous retroviruses across the many different analysis contexts in which they are relevant.

Features

  • Comprehensive Reference Sequence Set: Incorporates reference sequence data for diverse retroviruses, including exogenous retroviruses and representative endogenous retroviruses (ERVs).

  • RT Alignment: Features a codon-based alignment of reverse transcriptase (RT) sequences, spanning a wide diversity of retroviruses and ERV lineages. This alignment is maintained in a version-controlled manner via GitHub, enabling reproducible updates and edits.

  • Reproducible RT Phylogeny: Implements a reproducible workflow in GLUE for constructing phylogenies of reverse transcriptase (RT) sequences, forming the basis for exploring retrovirus and ERV diversity and evolutionary relationships.

Core Project Overview

| Property | Description |
| --- | --- |
| Scope | Retroviruses (family Retroviridae; endogenous and exogenous) |
| Development Period | 2024-present |
| Lead Developers | Robert J. Gifford |
| Main Objectives | Comparative genomics, Molecular epidemiology |
| Data Sources | NCBI Nucleotide, NCBI Genomes via the DIGS Tool |
| Associated Tools | BLAST+, MAFFT, RAxML |
| Offline Project | GitHub (Private) |
| Online Access | None as yet |
| Status | Under development |
| User Guide | None yet |

Extension Layers

  • ERVdb-Homo-sapiens: Adds all ERV data from the human (Homo sapiens) genome.
  • ERVdb-Mus-musculus: Adds all ERV data from the mouse (Mus musculus) genome.


Lentivirus-GLUE

Background

Lentivirus-GLUE is a specialized resource designed to support comparative genomic and evolutionary analysis of lentiviruses.

Lentiviruses are complex retroviruses that cause chronic diseases in humans and animals, with HIV being the best-known example. Their high variability and ability to evade the immune system make them particularly challenging to manage and treat.

Scope and History

Lentivirus-GLUE was first developed as an HIV project, and later expanded to encompass all lentiviruses, including ERVs.

The HIV and primate lentivirus components were developed as part of the research activities of the Centre for HIV RNA Studies (CRNA) from 2017 to 2022.

The small ruminant lentivirus (SRLV) extension was developed in association with an investigation into the origin and emergence of pandemic SRLV infection, published in 2023.

The ERV components were compiled in association with the discovery of Springhare endogenous lentivirus, a lentiviral ERV identified in the genome of a rodent. This study included comprehensive mapping of lentivirus ERVs in published whole genome sequence (WGS) assemblies.

Features

  • Comprehensive Genomic Database: Integrates data from all known lentivirus sequences, including species-specific NCBI extension projects that add all lentivirus sequences in GenBank, providing a robust foundation for comparative genomics research.

  • Genotyping: Assignment of genotypes and subtypes using GLUE's maximum likelihood clade assignment (MLCA) method.

  • Translation Engine: Models ribosomal frameshifting, supporting accurate translation for all retrovirus species, including lentiviruses, to ensure precise protein-level analysis.
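
To illustrate what modelling ribosomal frameshifting involves, the sketch below translates a toy sequence in frame 0 up to a shift point and then continues one nucleotide back (a -1 frameshift), as happens at the gag-pol boundary of many retroviruses. This is a minimal illustration, not GLUE's translation engine: the function names, the `shift_at` parameter, and the toy sequence (built around a TTTTTTA slippery-site motif like the one in HIV-1) are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt: str) -> str:
    """Translate complete codons in frame 0; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt) - 2, 3))


def translate_with_frameshift(nt: str, shift_at: int, shift: int = -1) -> str:
    """Translate nt[:shift_at] in frame 0, then resume at shift_at + shift.

    With shift = -1 the last nucleotide before shift_at is read twice,
    mimicking a ribosome slipping back by one position.
    """
    if shift_at % 3 != 0:
        raise ValueError("shift_at must lie on a codon boundary in frame 0")
    upstream = translate(nt[:shift_at])
    downstream = translate(nt[shift_at + shift:])
    return upstream + downstream


# Toy sequence: ATG AAT TTT TTA GGG AAG ATC TGG (slippery motif TTTTTTA).
toy = "ATGAATTTTTTAGGGAAGATCTGG"
print(translate(toy))                          # frame-0 product
print(translate_with_frameshift(toy, 12, -1))  # transframe product
```

In this toy case the frame-0 product is `MNFLGKIW`, while the frameshifted product is `MNFLREDL`: the two share the upstream residues and diverge after the slip, which is why a translation engine that ignores frameshifting would mistranslate the downstream gene.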

Core Project Overview

| Property | Description |
| --- | --- |
| Scope | Lentiviruses (genus Lentivirus) |
| Development Period | 2018-2024 |
| Lead Developers | Robert J. Gifford |
| Main Objectives | Comparative genomics, Molecular epidemiology |
| Data Sources | NCBI |
| Associated Tools | BLAST+, MAFFT, RAxML |
| Offline Project | GitHub |
| Online Access | None as yet |
| Status | Mature. Actively being developed |
| User Guide | GitHub Wiki |

Extension Projects

Lentivirus-GLUE can be extended with additional layers, openly available via GitHub, including:

  • Lentivirus-GLUE-Primates: adds all NCBI sequence data for primate lentiviruses, including HIV-1 and HIV-2.
  • Lentivirus-GLUE-SRLV: small ruminant lentiviruses (SRLVs), adds all NCBI sequence data plus curated metadata and analysis logic.
  • Lentivirus-GLUE-EIAV: equine infectious anemia virus (EIAV), adds all NCBI sequence data.
  • Lentivirus-GLUE-FIV: feline immunodeficiency viruses (FIVs), adds all NCBI sequence data.
  • Lentivirus-GLUE-ERV: adds lentivirus sequences that occur as endogenous retroviruses (ERVs).


Deltaretrovirus-GLUE

Background

Deltaretroviruses are a group of retroviruses that primarily infect mammals, including humans. They are characterized by their ability to establish lifelong infections, which can lead to severe diseases such as leukemia and lymphoma. Notable deltaretroviruses include human T-lymphotropic virus type 1 (HTLV-1), which causes adult T-cell leukemia/lymphoma (ATLL), and bovine leukemia virus (BLV), which induces leukemia in cattle.

Scope and History

Deltaretrovirus-GLUE was originally developed to support paleovirological investigations of Deltaretrovirus-derived ERVs. It was later expanded to incorporate all published deltaretrovirus sequences.

Features

  • Comprehensive Genomic Database: Integrates data from all published deltaretrovirus sequences, providing a robust foundation for comparative genomics research.

Core Project Overview

| Property | Description |
| --- | --- |
| Scope | Deltaretroviruses (genus Deltaretrovirus) |
| Development Period | 2018-2024 |
| Lead Developers | Robert J. Gifford |
| Main Objectives | Comparative genomics, Molecular epidemiology |
| Data Sources | NCBI |
| Associated Tools | BLAST+, MAFFT, RAxML |
| Offline Project | GitHub |
| Online Access | None as yet |
| Status | Mature. Not currently being developed |
| User Guide | GitHub Wiki (Under Construction) |

