Roadmap / todos #1

Open
vadimnazarov opened this issue Mar 12, 2015 · 0 comments

2.0 version

MAAG

  • Implement shifts in event probabilities.
    • In Clonotypes.
    • In MAAGBuilder.
  • Replace the generation probability computation with a forward-algorithm-like procedure.
  • Add MarkovChain with errors.
    • Tests
  • Implement errors in MAAGBuilder.
    • V.
    • D.
    • J.
  • Implement errors in MAAGForward-backward.
    • VJ (waiting for 100K test)
    • VDJ
  • Implement errors in alignments.
  • Fix replacement of MAAG event probe in MAAGBuilder.
  • Add move assignment operator to MAAG.
  • Which value should the error probability be initialised with?
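
A forward-algorithm-like computation, as mentioned in the list above, can be sketched as follows. This is only an illustration under assumptions: the `Matrix` alias and the `forward_probability` function are hypothetical names, and a chain of dense matrices is a stand-in for however MAAG actually stores event probabilities. The idea is to carry one vector of partial sums through the chain instead of enumerating every path.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical dense matrix type; not the project's actual MAAG storage.
using Matrix = std::vector<std::vector<double>>;

// Sum the probabilities of all paths through a chain of matrices.
// chain[k][i][j] is the weight of moving from state i at layer k to
// state j at layer k + 1. The result equals the sum of all entries of
// chain[0] * chain[1] * ... * chain[n - 1], computed with vector-matrix
// products only.
double forward_probability(const std::vector<Matrix>& chain) {
    if (chain.empty()) return 0.0;
    // Start with a weight of 1 on every row of the first matrix.
    std::vector<double> forward(chain.front().size(), 1.0);
    for (const Matrix& m : chain) {
        assert(!m.empty() && m.size() == forward.size());
        std::vector<double> next(m.front().size(), 0.0);
        for (std::size_t i = 0; i < m.size(); ++i) {
            assert(m[i].size() == next.size());
            for (std::size_t j = 0; j < next.size(); ++j) {
                next[j] += forward[i] * m[i][j];
            }
        }
        forward = std::move(next);
    }
    // Sum over the final states.
    double total = 0.0;
    for (double p : forward) total += p;
    return total;
}
```

For a 1x2 matrix {0.5, 0.5} followed by a 2x1 matrix {0.2; 0.4}, the two paths contribute 0.5 * 0.2 and 0.5 * 0.4, so the function returns approximately 0.3. The cost grows with the sum of the matrix sizes rather than with the number of paths.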

PAM

  • Implement a PAM + inference algorithm with errors in alignments.
    • VJ
    • VDJ
  • Fix segfault
    "There are four common mistakes that lead to segmentation faults: dereferencing NULL, dereferencing an uninitialized pointer, dereferencing a pointer that has been freed (or deleted, in C++) or that has gone out of scope (in the case of arrays declared in functions), and writing off the end of an array.
    A fifth way of causing a segfault is a recursive function that uses all of the stack space. On some systems, this will cause a "stack overflow" report, and on others, it will merely appear as another type of segmentation fault."

IO

  • Fix Python converter (V / D / J alignments column instead of starts/ends columns)
  • Fix writer
  • Refactor parser.
  • Refactor parser with the new aligner with virtual functions instead of templates.
  • Implement a separate class for aligning all genes to clonotype sequences. The user can optionally pass it as an object to Parser.
    • Implement SW local aligner for Variable genes.
    • Implement SW local aligner for Joining genes.
  • Add translation subroutine.
  • Add aligner parameters for alignment - thresholds for length / score, etc.
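
The SW (Smith-Waterman) local aligner items above could start from a score-only sketch like the one below. Everything here is an assumption for illustration: `SWScoring`, `sw_local_score` and the default scoring values are hypothetical, a linear gap penalty is used for brevity, and no traceback is kept, so it reports only the best local score rather than alignment coordinates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scoring parameters; the values are placeholders.
struct SWScoring {
    int match = 2;
    int mismatch = -1;
    int gap = -2;  // linear gap penalty
};

// Return the best Smith-Waterman local alignment score of `query` against `gene`.
int sw_local_score(const std::string& gene, const std::string& query,
                   const SWScoring& s = SWScoring()) {
    assert(s.match > 0 && s.mismatch < 0 && s.gap < 0);
    // Two rolling rows of the dynamic-programming table suffice for the score.
    std::vector<int> prev(query.size() + 1, 0);
    std::vector<int> cur(query.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= gene.size(); ++i) {
        cur[0] = 0;
        for (std::size_t j = 1; j <= query.size(); ++j) {
            const int diag = prev[j - 1] +
                (gene[i - 1] == query[j - 1] ? s.match : s.mismatch);
            const int up = prev[j] + s.gap;
            const int left = cur[j - 1] + s.gap;
            // Clamping at zero is what makes the alignment local.
            cur[j] = std::max({0, diag, up, left});
            best = std::max(best, cur[j]);
        }
        std::swap(prev, cur);
    }
    return best;
}
```

A length or score threshold, as in the last item of the list, would then be a comparison against the returned value, for example accepting a gene only if `sw_local_score(gene, read)` exceeds a user-supplied minimum.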

2.1 version

MAAG

  • Add MarkovChain to MAAG (for amino acids).
    • VJ
    • VDJ
  • Implement MAAGaa
    • VJ
    • VDJ
  • Implement amino acid sequence MAAG builder.
    • Tests.

IO

  • Implement amino acid aligner.
    • VJ
      • Tests.
    • VDJ
      • Tests.

2.2 version

PAM

  • Data diversity measure.
  • Implement and test new secret EM algorithm.
    • Save #iter for each parameter, not globally.

2.3 version

Optimisations

2.4 version

Docs

  • Add support for high precision numbers or decide to work only with long doubles.
  • Write API documentation using Doxygen.
  • Write general / usage documentation using MkDocs.
  • Publish all documentation on GitHub pages.

2.5 version

IO

  • MAAG serialization.
    • Binary representation.
      • Tests.
    • Reading.
      • Tests.
    • Writing.
      • Tests.
  • ??? Memory mapped MAAG repertoire in case of very large files (align -> save to disk -> read from the memory mapped file).
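
As a starting point for the binary representation, a length-prefixed layout for one vector of probabilities might look like the sketch below. The function names and the layout (a 64-bit element count followed by raw doubles in host byte order) are assumptions, not the project's format; a real MAAG file would also need a header, a version field and the matrix shapes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ios>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Write a vector as: 64-bit element count, then the raw doubles.
// Host byte order is used, so the bytes are not portable across endianness.
void write_probabilities(std::ostream& out, const std::vector<double>& probs) {
    assert(out.good());
    const std::uint64_t n = probs.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof(n));
    out.write(reinterpret_cast<const char*>(probs.data()),
              static_cast<std::streamsize>(n * sizeof(double)));
}

// Read a vector written by write_probabilities; returns false on a short read.
// Note: a corrupted count could request a very large allocation.
bool read_probabilities(std::istream& in, std::vector<double>& probs) {
    std::uint64_t n = 0;
    if (!in.read(reinterpret_cast<char*>(&n), sizeof(n))) return false;
    probs.resize(static_cast<std::size_t>(n));
    in.read(reinterpret_cast<char*>(probs.data()),
            static_cast<std::streamsize>(n * sizeof(double)));
    return static_cast<bool>(in);
}

// Round-trip through an in-memory stream, as a reading/writing test might do.
bool roundtrip_example() {
    const std::vector<double> original = {0.25, 0.5, 0.125};
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    write_probabilities(buffer, original);
    std::vector<double> restored;
    return read_probabilities(buffer, restored) && restored == original;
}
```

Because the payload is a contiguous run of doubles, the same layout could later be read through a memory-mapped file without a parsing step, which is relevant to the last item in the list.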

Far Future

MAAG

  • Add checks for zero or error gene segments and other events in MAAG builder.

AAPAG

  • Implement AAPAG (Amino Acid Pattern Assembly Graph).
  • Implement fast generation of neighbour amino acid sequences.

Optimisations

  • Play with SIMD https://github.com/p12tic/libsimdpp
    • markov chains, probs in forward-backward
    • computing of full probabilities
  • Rewrite everything using templates, so that the code has no unnecessary "ifs". Provide the basic scripts (compute, inference and generate) for each possible recombination.
  • Do return value optimisation everywhere when possible.
  • Check if lazy evaluation can be added anywhere.
  • Decide whether to refactor MarkovChain in MAAGBuilder.
  • Branching (if - statements) optimisations.
    • Try to always build event indices MMC, just do not include it to the resulting MAAG.
    • Move the if (full_build) checks out of the loops in MAAGBuilder and into a single separate loop of their own.
    • ?: instead of if-else in MAAGBuilder deletions and insertions.
  • Benchmark ClonotypeBuilder procedures that return void against ones that return ClonotypeBuilder&.
  • Use fixed-size matrices in some cases like VJ deletions because all VJ gene segments sequences are pretty similar in size. (???)
  • Rewrite ModelParameterVector with plain arrays.
  • Optimise sequence class (currently std::string, need speed and memory improvements using bit vectors).
  • Add a compilation option that removes all verbose output for speed.
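
The bit-vector idea for the sequence class could look roughly like the sketch below, which packs four nucleotides into each byte. `PackedSequence` is a hypothetical name, and the sketch assumes sequences contain only A, C, G and T; ambiguous bases such as N would need a wider encoding or a separate mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical 2-bit packed nucleotide sequence (A, C, G, T only).
class PackedSequence {
public:
    explicit PackedSequence(const std::string& seq)
        : size_(seq.size()), bits_((seq.size() + 3) / 4, 0) {
        for (std::size_t i = 0; i < seq.size(); ++i) {
            // Each base occupies 2 bits; 4 bases share one byte.
            bits_[i / 4] |= static_cast<std::uint8_t>(encode(seq[i]) << (2 * (i % 4)));
        }
    }

    std::size_t size() const { return size_; }

    // Number of bytes used by the packed storage.
    std::size_t bytes() const { return bits_.size(); }

    char operator[](std::size_t i) const {
        assert(i < size_);
        return "ACGT"[(bits_[i / 4] >> (2 * (i % 4))) & 3u];
    }

    // Unpack back into a plain string.
    std::string str() const {
        std::string out(size_, 'A');
        for (std::size_t i = 0; i < size_; ++i) out[i] = (*this)[i];
        return out;
    }

private:
    static unsigned encode(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default:
                assert(false && "only A, C, G, T are supported");
                return 0;
        }
    }

    std::size_t size_;
    std::vector<std::uint8_t> bits_;
};
```

An 8-base sequence then occupies 2 bytes of payload instead of 8. Whether this also helps speed depends on how often single bases are accessed, since every read now costs a shift and a mask.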

Refactoring

  • Replace all raw pointers with std::unique_ptr.
  • Replace my own tests with Google Test.
  • Shared ptr for VDJRecombinationGenes.
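
The raw-pointer replacement could follow a pattern like the sketch below. `GeneSegment` and `SegmentTable` are hypothetical stand-ins, not classes from the project; the point is only that the container owns its elements through std::unique_ptr, so no destructor or manual delete is needed, while callers get non-owning references.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object that was previously held by raw pointer.
struct GeneSegment {
    std::string name;
    std::string sequence;
};

class SegmentTable {
public:
    // Ownership is transferred into the table.
    void add(std::unique_ptr<GeneSegment> segment) {
        assert(segment != nullptr);
        segments_.push_back(std::move(segment));
    }

    // Non-owning access; throws std::out_of_range on a bad index.
    const GeneSegment& at(std::size_t index) const { return *segments_.at(index); }

    std::size_t size() const { return segments_.size(); }

private:
    // Elements are destroyed automatically with the table.
    std::vector<std::unique_ptr<GeneSegment>> segments_;
};
```

A caller would write `table.add(std::unique_ptr<GeneSegment>(new GeneSegment{name, sequence}))`, or use std::make_unique where C++14 is available. Where several objects genuinely share one instance, as the VDJRecombinationGenes item suggests, std::shared_ptr would be the analogous choice.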

Other
