pairtools 1.0.0 updates #117

Merged
merged 52 commits into from
Jun 1, 2022

@agalitsyna agalitsyna commented Apr 7, 2022

pairtools v1.0.0 roadmap updates: #116

* handle empty chromosomes, resolved #76

* fixed rfrags indexing and first rfrag omission, resolved #73

* resolved or deprecated #16

* pairtools restrict tests, resolved #61

* option to add only the first header in merge, resolved #18

* in merge, added an option to concatenate instead of merging sorted inputs, resolving #23

* merge checks that the columns of the inputs are the same
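
To illustrate the three merge items above (first header only, concatenate vs. merge of sorted inputs, and the column check), here is a minimal sketch. It is not the pairtools implementation: `read_pairs`, `get_columns`, `sort_key` and `combine` are hypothetical names, and the sort order (chrom1, chrom2, pos1, pos2) is an assumption made for the example.

```python
import heapq


def read_pairs(lines):
    """Split .pairs-like lines into header lines ('#'-prefixed) and body lines."""
    header = [ln for ln in lines if ln.startswith("#")]
    body = [ln for ln in lines if not ln.startswith("#")]
    return header, body


def get_columns(header):
    """Return the column names from the '#columns:' header line, or None."""
    for ln in header:
        if ln.startswith("#columns:"):
            return ln.split()[1:]
    return None


def sort_key(line):
    # Assumed sort order for this sketch: chrom1, chrom2, pos1, pos2.
    f = line.split("\t")
    return (f[1], f[3], int(f[2]), int(f[4]))


def combine(inputs, concatenate=False):
    """Combine several already-sorted inputs, keeping only the first header."""
    parsed = [read_pairs(lines) for lines in inputs]
    columns = [get_columns(h) for h, _ in parsed]
    # Refuse to combine inputs whose declared columns differ.
    if any(c != columns[0] for c in columns):
        raise ValueError("inputs have different columns")
    header = parsed[0][0]
    bodies = [b for _, b in parsed]
    if concatenate:
        # Plain concatenation: the order of the inputs is preserved.
        body = [ln for b in bodies for ln in b]
    else:
        # k-way merge: the output stays sorted if every input is sorted.
        body = list(heapq.merge(*bodies, key=sort_key))
    return header + body
```

Concatenation is cheaper and is enough when the inputs do not need to be interleaved; the merge path relies on each input already being sorted by the same key.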
@agalitsyna agalitsyna changed the title from "pairtools 0.4.0 updates" to "pairtools 1.0.0 updates" Apr 8, 2022
- auto_open defaults to stdin/stdout when the path evaluates to False, resolved #48

- auto_open defaults to stdin/stdout when the path is "-"

- if the stream is optional, it's controlled by the module itself

Warning: this might be unstable because not all use cases were tested!
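
The defaulting rule described above can be sketched as follows. `open_or_std` is a hypothetical helper written for this comment, not the actual auto_open code, and it only covers plain text files.

```python
import sys


def open_or_std(path, mode="r"):
    """Hypothetical helper: fall back to stdin/stdout for empty paths or "-"."""
    if mode not in ("r", "w"):
        raise ValueError("mode must be 'r' or 'w'")
    # A falsy path (None, "") or "-" selects the standard stream.
    if not path or path == "-":
        return sys.stdin if mode == "r" else sys.stdout
    return open(path, mode)
```

With this rule, a module that treats a stream as optional has to decide for itself whether to call the helper at all, since a missing path no longer means "no stream".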
Improved version of parse2, addressing the comments from the previous PR: #96

- Separation of the parse and parse2 modules. Parse has an option --walks-policy all, which parses long walks but always reports pair orientation and the outer positions of 5'-ends, as if each pair had been read in paired-end mode independently. Parse2 is specifically designed for long walks and has the options --report-position and --report-orientation, which can be used to report junctions, reads, or walks.

- Parse2 can parse single-end reads with the --single-end option, tested on minimap2 output for MC-3C.

- Parse2 has max_fragment_size instead of parse's max_molecule_size, which helps to determine the overlapping ends of forward and reverse reads.

- A recent update simplifies the code:
  - a single _parse library used by both parse and parse2,
  - a number of functions that reduce repetitive code, e.g. the push_pair function,
  - docstrings and a documented structure of the _parse library.

- Both parse and parse2 have options to report 5' or 3' ends and to flip alignments according to chromosome coordinate.

- Both parse and parse2 have the pysam backend

- Improvements of the tests for parse and parse2

- Documentation includes description of various --report-orientation and --report-position cases.
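
As a rough illustration of the 5'/3' end reporting and the flipping mentioned in the list above, here is a small sketch. It does not reproduce the _parse library: `end_position` and `flip_pair` are hypothetical names, alignments are assumed to be given as leftmost start and rightmost end coordinates, and chromosome order is passed in explicitly as a rank dictionary.

```python
def end_position(start, end, strand, which="5"):
    """Return the 5' or 3' end of an alignment.

    start/end are the leftmost and rightmost coordinates (an assumption of
    this sketch); on the reverse strand the 5' end is the rightmost one.
    """
    if which not in ("5", "3"):
        raise ValueError("which must be '5' or '3'")
    five, three = (start, end) if strand == "+" else (end, start)
    return five if which == "5" else three


def flip_pair(side1, side2, chrom_order):
    """Order two sides (chrom, pos, strand) so the lower coordinate comes first."""

    def key(side):
        return (chrom_order[side[0]], side[1])

    return (side1, side2) if key(side1) <= key(side2) else (side2, side1)
```

For example, with `chrom_order = {"chr1": 0, "chr2": 1}`, a pair whose first side is on chr2 and second side on chr1 would be swapped so that the chr1 side is reported first.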
@agalitsyna agalitsyna mentioned this pull request Apr 14, 2022
- new module called by `pairtools header`
- submodules: 
  - generate : Generate the header
  - set-columns : Add the columns to the .pairs/pairsam file
  - transfer : Transfer the header from one pairs file to another
  - validate-columns : Validate the columns of the .pairs/pairsam file
- resolves #119 
- option remove-columns for `pairtools select`: Remove the columns from .pairs/pairsam file
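
For readers unfamiliar with what a column check involves, here is a minimal sketch of the idea behind validate-columns. It is an illustration under assumptions, not the code of `pairtools header`: `validate_columns` is a hypothetical function that only compares the number of tab-separated fields in each record with the names declared on the `#columns:` header line.

```python
def validate_columns(lines):
    """Check that every record has as many fields as '#columns:' declares."""
    columns = None
    for n, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("#"):
            if line.startswith("#columns:"):
                columns = line.split()[1:]
            continue
        if columns is None:
            raise ValueError("no '#columns:' line found before the first record")
        n_fields = len(line.split("\t"))
        if n_fields != len(columns):
            raise ValueError(
                f"line {n}: expected {len(columns)} fields, found {n_fields}"
            )
    return columns
```

The function returns the declared column names on success, so the same result could be reused when adding or removing columns.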
* --tag-mode parameter: XA and XB modes for original and github versions of bwa

* XB number of fields problem resolved

* comparison of str to integer in alt scores resolved (did not work before!)

* reporting of optimal, suboptimal and second suboptimal scores added (controlled by --report-scores)

* explanations of the inner workings added to the docs
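
The str-vs-integer issue above is easy to reproduce: values read from a tag are strings, and strings compare lexicographically. The sketch below parses a bwa-style XA value (entries of the form `chr,±pos,CIGAR,NM;`) and casts the numeric fields before comparing. `parse_xa` is a hypothetical helper for illustration, not the pairtools parser, and the XB layout is not covered.

```python
def parse_xa(value):
    """Parse an XA tag value into a list of alternative hits."""
    hits = []
    for entry in value.rstrip(";").split(";"):
        if not entry:
            continue
        chrom, pos, cigar, nm = entry.split(",")
        hits.append(
            {
                "chrom": chrom,
                "strand": pos[0],     # leading '+' or '-'
                "pos": int(pos[1:]),  # cast so positions compare numerically
                "cigar": cigar,
                "nm": int(nm),        # cast so edit distances compare numerically
            }
        )
    return hits


hits = parse_xa("chr2,+1000,50M,10;chr3,-2000,50M,9;")
best = min(hits, key=lambda h: h["nm"])  # the chr3 hit, since 9 < 10 as integers
```

Without the casts the comparison would be `"10" < "9"`, which is true for strings and would pick the wrong hit.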
…set; sphinx doc update (added pysam); header warning if empty, and error when trying to add a field to an empty one
@agalitsyna agalitsyna mentioned this pull request Apr 26, 2022
agalitsyna and others added 6 commits April 26, 2022 16:04
* Add summaries

* Add functions for duplication tile and complexity

* Make dedup stats!

Co-authored-by: Aleksandra Galitsyna <agalitzina@gmail.com>
agalitsyna and others added 14 commits April 27, 2022 15:28
* Filtering stats

* Multiple filters; save the filtering expression

* distance bins saved, saving and reading yaml works

* update test

* select is now a separate library used by stats and select CLI

* Python engine for stats filtering added

* black stats and select

* travis fix yaml

Co-authored-by: Aleksandra Galitsyna <agalitzina@gmail.com>
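
A rough sketch of what filtering stats with a Python engine can look like: each named filter is a Python expression evaluated against the fields of a pair, and pairs are counted per filter. This is an assumption-laden illustration, not the stats or select code; `count_by_filter` and the filter names are hypothetical, and `eval` should only ever be used on trusted expressions.

```python
def count_by_filter(rows, filters):
    """Count rows matching each named Python expression.

    rows: iterable of dicts mapping column names to values.
    filters: dict mapping a filter name to an expression string.
    """
    compiled = {
        name: compile(expr, "<filter>", "eval") for name, expr in filters.items()
    }
    # Expose only a small set of helpers to the expressions.
    safe_globals = {"__builtins__": {}, "abs": abs}
    counts = {name: 0 for name in filters}
    for row in rows:
        for name, code in compiled.items():
            if eval(code, safe_globals, dict(row)):
                counts[name] += 1
    return counts
```

Keeping the expression strings next to the counts makes it possible to save them together, so the saved stats record which filter produced which number.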
@agalitsyna agalitsyna merged commit d7d0939 into master Jun 1, 2022