UPC-BSG-TaxonomicAssignment

The goal of this project is to write a custom implementation of a taxonomic assignment task. First, a real NCBI taxonomy reference (nodes.dmp) is transformed into a tree. Afterwards, a file containing genome reads is read (each read is a set of taxa compatible with a new genomic sequence). An algorithm then finds the best taxonomic assignment for the given genomic sequence based on maximum F-measure.
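As a rough illustration of what "best assignment by maximum F-measure" can mean, here is a minimal sketch. It is not taken from full_pipeline.py; it assumes that the taxa in a read are leaves of the tree, that a candidate node "predicts" every leaf below it, and that precision and recall are computed against the read's taxa. The names `build_children`, `leaves_under` and `best_assignment`, and the toy tree, are hypothetical.

```python
from collections import defaultdict


def build_children(parent):
    """Invert a child -> parent map into a parent -> children map."""
    children = defaultdict(list)
    for node, par in parent.items():
        if node != par:  # the root points to itself, as in NCBI nodes.dmp
            children[par].append(node)
    return children


def leaves_under(node, children):
    """Return the set of leaves in the subtree rooted at `node`."""
    stack, leaves = [node], set()
    while stack:
        current = stack.pop()
        kids = children.get(current, [])
        if kids:
            stack.extend(kids)
        else:
            leaves.add(current)
    return leaves


def best_assignment(parent, hits):
    """Pick the node whose subtree maximises the F-measure against `hits`.

    Assumption: `hits` contains leaf taxa only.
    """
    hits = set(hits)
    children = build_children(parent)

    # Only the hits and their ancestors can have a non-zero F-measure.
    candidates = set()
    for hit in hits:
        node = hit
        while True:
            candidates.add(node)
            if parent[node] == node:
                break
            node = parent[node]

    best_node, best_f = None, -1.0
    for node in sorted(candidates):  # sorted for a deterministic tie-break
        leaves = leaves_under(node, children)
        tp = len(leaves & hits)   # hits covered by this subtree
        fp = len(leaves - hits)   # covered leaves that are not hits
        fn = len(hits - leaves)   # hits outside this subtree
        if tp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f_measure = 2 * precision * recall / (precision + recall)
        if f_measure > best_f:
            best_node, best_f = node, f_measure
    return best_node, best_f


# Toy tree: 1 is the root; 2 and 3 are internal nodes; 4..8 are leaves.
toy_parent = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3, 8: 3}
node, score = best_assignment(toy_parent, {4, 5})
print(node, round(score, 3))
```

In this sketch a deeper node gains precision but may lose recall, so the F-measure trades off specificity of the assignment against how many of the read's taxa it covers.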

Extract the repository and navigate to its folder in a command prompt (CMD).
Download taxdump.tar.gz from ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz and extract nodes.dmp (the taxonomy dataset) from it.
Make sure 'nodes.dmp' (taxonomy dataset) and 'sample.inp' (genome reads) are in the same directory!
Execute: 'python full_pipeline.py'

--> To change input (location or name), run 'python full_pipeline.py -h' for instructions.
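For orientation, the sketch below shows one way nodes.dmp could be read into a child-to-parent map. It is not the code in full_pipeline.py. It relies only on the first three fields of each row (tax id, parent tax id, rank), which NCBI separates with `|` padded by tabs; the function names `parse_nodes` and `lineage` and the shortened sample rows are hypothetical.

```python
def parse_nodes(lines):
    """Parse nodes.dmp rows into child -> parent and tax id -> rank maps."""
    parent, rank = {}, {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Fields are separated by "|" with surrounding tabs; strip the padding.
        fields = [field.strip() for field in line.split("|")]
        tax_id, parent_id = int(fields[0]), int(fields[1])
        parent[tax_id] = parent_id
        rank[tax_id] = fields[2]
    return parent, rank


def lineage(tax_id, parent):
    """Return the path from `tax_id` up to the root (whose parent is itself)."""
    path = [tax_id]
    while parent[path[-1]] != path[-1]:
        path.append(parent[path[-1]])
    return path


# Shortened sample rows in the nodes.dmp layout (real rows carry more fields).
sample_rows = [
    "1\t|\t1\t|\tno rank\t|\n",
    "131567\t|\t1\t|\tno rank\t|\n",
    "2\t|\t131567\t|\tsuperkingdom\t|\n",
]
parent_map, rank_map = parse_nodes(sample_rows)
print(lineage(2, parent_map))

# With the real file in the working directory, the same function accepts
# an open file handle:
#     with open("nodes.dmp") as handle:
#         parent_map, rank_map = parse_nodes(handle)
```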

LouisVanLangendonck/UPC-BSG-TaxonomicAssignment
