
Program style

Joseph W. Brown edited this page Dec 12, 2019 · 8 revisions

The goal of phyx is to provide a set of command-line tools for Unix/GNU/Linux (or anything else with a C++ compiler, I suppose) that are intuitive and share a similar structure and feel, much like common Unix tools such as ls, rm, and cd. To achieve this, each program should follow some general style guidelines, so that it feels like part of the package and is easier to interact with.

Program arguments

Program arguments should be provided in one of two ways: at the command line or with a simple config file. The config file always takes the same format: each line holds an option name and its value. Command-line arguments take short and long forms, such as -m VALUE or --mean=VALUE. This is achieved with the getopt_long function from #include <getopt.h>; see one of the existing mains for an example. The --help output should look something like this:

Filter fastq files by mean quality score and send
result to standard output. Can read from stdin or file

Usage: pxfqfilt [OPTION]... [FILE]...
Options:
 -m, --mean=VALUE    mean value under which seqs are filtered
 -s, --seqf=FILE     input sequence file, stdin otherwise
 -o, --outf=FILE     output sequence file, stdout otherwise
 -h, --help          display this help and exit
 -V, --version       display version and exit
 -C, --citation      display phyx citation and exit

Report bugs to: <https://github.com/FePhyFoFum/phyx/issues>
phyx home page: <https://github.com/FePhyFoFum/phyx>

and the --version output should look like this:

pxfqfilt 0.1
Copyright (C) 2013 FePhyFoFum
License GPLv2
written by Stephen A. Smith (blackrim)

so please format accordingly (with yourself as the author, of course, or added to the author list if you edit an existing program).
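As a rough illustration of the getopt_long approach, the sketch below parses the options shown in the pxfqfilt help text. It is not taken from the phyx sources: the Options struct and the parse_args function are hypothetical names introduced here, and the existing mains remain the reference for how phyx actually does this.

```cpp
#include <getopt.h>

#include <cstdlib>
#include <string>

// Hypothetical container for parsed options (not a phyx type).
struct Options {
    double mean = 0.0;
    std::string seqf;      // empty means "read from stdin"
    std::string outf;      // empty means "write to stdout"
    bool help = false;
    bool version = false;
    bool citation = false;
    bool ok = true;        // set to false when an unknown option is seen
};

// Hypothetical helper: parse argv using the short and long options
// listed in the example help text above.
Options parse_args(int argc, char* argv[]) {
    static const struct option long_opts[] = {
        {"mean",     required_argument, nullptr, 'm'},
        {"seqf",     required_argument, nullptr, 's'},
        {"outf",     required_argument, nullptr, 'o'},
        {"help",     no_argument,       nullptr, 'h'},
        {"version",  no_argument,       nullptr, 'V'},
        {"citation", no_argument,       nullptr, 'C'},
        {nullptr, 0, nullptr, 0}
    };

    Options opts;
    optind = 1;  // start scanning at the first argument
    int c;
    // A colon after a letter means that option takes an argument.
    while ((c = getopt_long(argc, argv, "m:s:o:hVC", long_opts, nullptr)) != -1) {
        switch (c) {
            case 'm': opts.mean = std::atof(optarg); break;
            case 's': opts.seqf = optarg; break;
            case 'o': opts.outf = optarg; break;
            case 'h': opts.help = true; break;
            case 'V': opts.version = true; break;
            case 'C': opts.citation = true; break;
            default:  opts.ok = false; break;  // getopt_long returned '?'
        }
    }
    return opts;
}
```

A main would call parse_args(argc, argv) first, print the help, version, or citation text and exit if the corresponding flag is set, and otherwise go on to open the input and output.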

Note: getopt strictly allows only one argument per option. To allow an arbitrary number of arguments (say, specifying the input as -s *.NEX), see the hack here.
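The config-file format mentioned at the start of this section (one option name and its value per line) could be read along the following lines. This is only a sketch: the page does not say how the name and value are separated, so whitespace separation is an assumption here, and parse_config is a hypothetical function rather than something in phyx.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: read "name value" pairs, one per line.
// ASSUMPTION: name and value are separated by whitespace.
std::map<std::string, std::string> parse_config(std::istream& in) {
    std::map<std::string, std::string> opts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name;
        std::string value;
        if (!(fields >> name)) {
            continue;  // skip blank lines
        }
        fields >> value;  // stays empty if the line has only a name
        opts[name] = value;
    }
    return opts;
}
```

Using the same option names as the long command-line options (mean, seqf, outf) would let both input routes fill the same settings.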

Man pages

If you format the help and version output as above, man pages can be created automatically with the help2man command. This is currently done by the script generate_manpages.py. When adding a new program, make sure to generate its man page and add cat man/$@.1.in > man/$@.1 to Makefile.in (see existing programs for an example).

File types

The file types supported in all programs, where relevant (tree files are not supported in sequence programs, for example), are fasta, fastq, phylip (extended), nexus, nexml (soon), and newick. For the most part, you don't have to specify the file type, as the programs should detect it.

Input/Output

Because most people work with files, input and output files need to be supported. However, we also support stdin/stdout in every program so that tools can be piped together. For an example of how to do this, see pxfqfilt. If a procedure can be executed on a stream (e.g., recoding), the program should be written that way. For example, do not read in and store an entire alignment if sequences can be processed individually; this reduces the memory footprint significantly. In many cases, streaming will not be possible.
