Identifying_Orthologues

This document outlines the workflow used to identify orthologues between two different genomes and to analyze specific query sequences from one genome in the context of another. The main tools used in this process are OrthoFinder for identifying orthologous genes and BLAST for sequence similarity searching. The idea is that Orthofinder finds orthologues genes based on their peptide sequence and then using Blastn with a more stringent parameter, we can narrow some of those to their similarity based on their coding sequence and by using both strategies we reduce the amount of false positive orthologues in our analysis.

Workflow Overview

Orthologue Identification with OrthoFinder:
- Purpose: Identify orthologous genes between two genomes using peptide sequnece to maintain hierarchical biological significance.
- Input: Peptide sequences from both genomes.
- Process: Run OrthoFinder on the input sequences to generate orthogroups representing orthologues across the genomes.
- Output: Orthogroups file listing the identified orthologues.
BLASTn Searching:
- Purpose: Search for specific gene coding squenes (CDS) as query from Genome A within Genome B.
- Input: Set of query gene CDS sequences from Genome A and CDS of Genome B.
- Process: Perform BLAST searches of these query sequences against the CDS sequences of Genome B.
- Output: BLAST result files containing matches and their details (e.g., alignment scores, E-values).
Integration and Analysis:
- Purpose: Determine if the query sequences and their corresponding BLAST hits in Genome B are part of the same orthogroups.
- Process:
  - Cross-reference the BLAST results with the OrthoFinder orthogroups.
  - Check if both the query and the subject IDs from the BLAST results are located in the same orthogroup.
- Output: A table containing the orthogroup number, the relevant BLAST results for matches where both sequences are in the same group from Orthofinder and Functional annotation identifier. file outputs:
  1. GOI_in_genome_b_results.csv - Table with all information from gene in genome A to gene in genome b, Orthogroup , BLASTn results and Functional annotation
  2. Final_GOI_in_Genome_b.csv - Table with just the loci idenfified in genome b to be ortologues to the GOIs
  3. Genome_A_GOIs_in_Genome_B_with_OGs.csv - Table with GOIs in genome A, GOIs in genome B, and Blastn results

Directory Structure

Home/ortho-genomes - Directory containing inout files for orthofinder for both genomes. Files should be:

Peptides_genome_a.fa = peptide sequences of all genes (proteome) from Genome A
Peptides_genome_b.fa = peptide sequences of all genes (proteome) from Genome B

Home/ - Directory containing remaining input files for pipeline:

Genes_of_interest.fa = fasta file containing all CDS sequences from genome A interested in finding in genome B
CDS_genome_b.fa= Gene coding seuqnece (CDS) from Genome B
Gene_annotations.txt= Text file that has one column with GOI_ID and a functional annotation on a different column ; Column name used for this GOI_ID identifier should replace <'TF_ID'> in line 59 of this code

Requirements

OrthoFinder: Installation guide and documentation
BLAST: Installation guide and documentation

Usage

To run this workflow, follow these steps:

Prepare your peptide and CDS sequence files in FASTA format and place them in the /Home/ and Home/ortho-genomes directories .
Run Orthofinder+Blastn_Orthology.sh
Analyze results from final table: GOI_in_genome_b_results.csv

Contact

For further questions or troubleshooting, please contact gih0004@vt.edu.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
LICENSE		LICENSE
Orthofinder+Blastn_Orthology.sh		Orthofinder+Blastn_Orthology.sh
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Identifying_Orthologues

Workflow Overview

Directory Structure

Requirements

Usage

Contact

About

Releases

Packages

Languages

License

gih0004/Identifying_Orthologues

Folders and files

Latest commit

History

Repository files navigation

Identifying_Orthologues

Workflow Overview

Directory Structure

Requirements

Usage

Contact

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages