Pierre Chaumeil edited this page Nov 23, 2022 · 2 revisions

GTDB R202

  • Assembly summary files downloaded on the 23rd of Sept 2020
  • LPSN data downloaded on the 24th of Sept 2020
  • BacDive processed on the 14th of October 2020
  • RS_GCF_005860925.2, RS_GCF_005862185.2 and RS_GCF_011765685.1 did not have an ncbi_taxonomy, so it was added manually

Step timeline

  • Rsync genomes: ~22h with 2x75cpus
  • Update RefSeq genomes: ~43h with 1x50cpus
  • Update GenBank genomes: ~16h with 1x50cpus
  • Prodigal: ~10h with 1x75cpus (~25K genomes)
  • Hmmsearch TIGR RS: ~14h with 1x30cpus (~7.6K genomes)
  • Hmmsearch TIGR GB: ~10h30 with 1x75cpus (~18K genomes)
  • Hmmsearch PFAM RS: ~14h with 1x75cpus (~8K genomes)
  • Hmmsearch PFAM GB: 3 days 16h with 1x75cpus (~85K genomes)
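
For capacity planning, the timeline above can be turned into rough CPU-hours per step. The following is a minimal sketch and not part of the GTDB pipeline. It assumes that "NxM cpus" means N jobs of M CPUs each, that the TIGR RS entry is 14 hours on 30 CPUs, and that every CPU was busy for the whole step, so the figures are upper bounds. The names `steps` and `cpu_hours` are illustrative only.

```python
# Wall-clock hours and CPU counts copied from the step timeline.
# Assumption: "NxM cpus" is read as N * M CPUs in total.
steps = [
    ("Rsync genomes", 22.0, 2 * 75),
    ("Update RefSeq genomes", 43.0, 1 * 50),
    ("Update GenBank genomes", 16.0, 1 * 50),
    ("Prodigal", 10.0, 1 * 75),
    ("Hmmsearch TIGR RS", 14.0, 1 * 30),  # assumed: 14 hours, 30 CPUs
    ("Hmmsearch TIGR GB", 10.5, 1 * 75),  # 10h30
    ("Hmmsearch PFAM RS", 14.0, 1 * 75),
    ("Hmmsearch PFAM GB", 3 * 24 + 16.0, 1 * 75),  # 3 days 16h
]


def cpu_hours(wall_hours, cpus):
    """Upper bound on CPU-hours, assuming all CPUs were busy throughout."""
    return wall_hours * cpus


# Sum of per-step wall-clock time (as if the steps ran back to back).
total_wall = sum(hours for _, hours, _ in steps)
# Sum of per-step CPU-hours.
total_cpu = sum(cpu_hours(hours, cpus) for _, hours, cpus in steps)

for name, hours, cpus in steps:
    print(f"{name:<24} {hours:>6.1f} h  x {cpus:>3} cpus = {cpu_hours(hours, cpus):>8.1f} CPU-h")
print(f"{'Total':<24} {total_wall:>6.1f} h  {'':>10}   {total_cpu:>8.1f} CPU-h")
```

Under these assumptions the steps sum to 217.5 wall-clock hours and about 15,858 CPU-hours, with the PFAM GenBank search and the rsync step accounting for most of the total.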