diff --git a/Cpla_contamination_removal.ipynb b/Cpla_contamination_removal.ipynb
index fdfafd1..a831b3a 100644
--- a/Cpla_contamination_removal.ipynb
+++ b/Cpla_contamination_removal.ipynb
@@ -14,7 +14,7 @@
    "source": [
     "## Table of Contents\n",
     "\n",
-    "1. For ~3500 candidate genes that have >90% similarity to a gene in Orug, find their peptide sequences (MAGOT), find their best hit in 11 other Apoidea (BLASTP), align (MAFFT), and build ML phylogenies (RAxML)\n",
+    "1. For ~3500 candidate genes that have >90% similarity to a gene in Orug, find their peptide sequences (MAGOT), find their best hit in 11 other ants and bees (BLASTP), align (MAFFT), and build ML phylogenies (RAxML)\n",
     "2. Read in trees and analyse\n",
     "3. Construct dataframe with sequence and phylogenetic features\n",
     "4. Plots showing that these features indicate contamination\n",
@@ -73,7 +73,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 1. For ~3500 candidate genes that have >90% similarity to a gene in Orug, find their peptide sequences (MAGOT), find their best hit in 11 other Apoidea (BLASTP), align (MAFFT), and build ML phylogenies (RAxML)\n",
+    "## 1. For ~3500 candidate genes that have >90% similarity to a gene in Orug, find their peptide sequences (MAGOT), find their best hit in 11 other ants and bees (BLASTP), align (MAFFT), and build ML phylogenies (RAxML)\n",
     "\n",
     "The best BLAST result for each of these resultant Cpla candidate genes from a range of Hymenoptera was determined."
    ]
@@ -428,7 +428,7 @@
     " for i in $(cat ${SPECIES}_candidate_seqs.txt); \n",
     " do \n",
     "\n",
-    " MAGOT get_seq_from_fasta '${SPECIES}.pep' $i True >> ${i}_seqs.fasta \n",
+    " python genome_tools.py get_seq_from_fasta '${SPECIES}.pep' $i True >> ${i}_seqs.fasta \n",
     "\n",
     " done\n",
     "done\n",
diff --git a/Orug_contamination_removal.ipynb b/Orug_contamination_removal.ipynb
index 4c899b4..857d5e0 100644
--- a/Orug_contamination_removal.ipynb
+++ b/Orug_contamination_removal.ipynb
@@ -14,7 +14,7 @@
    "source": [
     "## Table of Contents\n",
     "\n",
-    "1. For ~3500 candidate genes that have >90% blast similarity to a gene in Cpla, find their peptide sequences (MAGOT), find their best hit in 11 other Apoidea (BLASTP), align (MAFFT), and build ML phylogenies (RAxML)\n",
+    "1. For ~3500 candidate genes that have >90% blast similarity to a gene in Cpla, find their peptide sequences (MAGOT), find their best hit in 11 other ants and bees (BLASTP), align (MAFFT), and build ML phylogenies (RAxML)\n",
     "2. Read in trees and analyse\n",
     "3. Construct dataframe with sequence and phylogenetic features\n",
     "4. Plots showing that these features indicate contamination\n",
@@ -73,7 +73,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 1. For ~3500 candidate genes that have >90% blast similarity to a gene in Cpla, find their peptide sequences (MAGOT), find their best hit in 11 other Apoidea (BLASTP), align (MAFFT), and build ML phylogenies (RAxML)\n",
+    "## 1. For ~3500 candidate genes that have >90% blast similarity to a gene in Cpla, find their peptide sequences (MAGOT), find their best hit in 11 other ants and bees (BLASTP), align (MAFFT), and build ML phylogenies (RAxML)\n",
     "\n",
     "The best BLAST result for each of these resultant Orug candidate genes from a range of Hymenoptera was determined."
    ]
@@ -421,7 +421,7 @@
     " for i in $(cat ${SPECIES}_candidate_seqs.txt); \n",
     " do \n",
     "\n",
-    " MAGOT get_seq_from_fasta '${SPECIES}.pep' $i True >> ${i}_seqs.fasta \n",
+    " python genome_tools.py get_seq_from_fasta '${SPECIES}.pep' $i True >> ${i}_seqs.fasta \n",
     "\n",
     " done\n",
     "done\n",