-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
fix: various little details #148
Conversation
WalkthroughThe pull request introduces updates to configuration files and workflow specifications, including a refined path specification for specific PDF files in a GitHub Actions workflow and the addition of resource configurations for bioinformatics tools in a new YAML file. These changes are aimed at enhancing the structure and clarity of both the project's workflow and resource management. Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Artifacts |
@coderabbitai review |
Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Outside diff range and nitpick comments (10)
.github/workflows/release-please.yml (1)
43-47
: Explicit file paths provide better control, but consider flexibility.The change from a wildcard pattern to specific file paths gives better control over which files are included in the artifact. This ensures only the necessary files are uploaded.
However, this approach might be less flexible if new PDF files are added in the future. Consider using a combination of specific paths and wildcards to balance control and flexibility.
For example:
path: | slides/Snakemake_HPC_*.pdf !slides/Snakemake_HPC_draft*.pdfThis would include all Snakemake HPC PDF files except drafts.
config/config_Mainz_NHR.yaml (2)
58-59
: LGTM: Cluster distribution specifiedThe addition of the
distro
field specifying "Alma-Linux" is beneficial. It provides important information about the cluster's operating system, which can be crucial for compatibility and troubleshooting.Consider adding a comment explaining why this information is important, e.g.:
# Specify the cluster's Linux distribution for compatibility checks distro: "Alma-Linux"
66-67
: LGTM: Display program specified, consider placementThe addition of the
display_program
field is useful for specifying how SVG files should be displayed in the cluster environment. This can be particularly helpful for scripts or workflows that generate SVG outputs.Consider moving this field closer to the other cluster-specific configurations for better organization. For example, you could place it right after the
distro
field:cluster: name: "Mogon-NHR" distro: "Alma-Linux" display_program: "display" account: "nhr-workflow" # ... rest of the cluster configurationThis placement would group all cluster-specific settings together, improving the overall readability of the configuration file.
slides/common/Plotting_DAGs.tex (1)
3-16
: Excellent visual enhancement and content restructuring!The introduction of a background template with a complex rulegraph image and the new frame with a focused table of contents significantly improves the visual appeal and structure of the presentation. The use of mdframed for better readability is a nice touch.
Consider adding a brief explanation or legend for the background image to help the audience understand its relevance to the topic.
slides/common/mod_cs_mainz.tex (3)
3-17
: Great improvement to the presentation structure and visual appeal!The new frame with the background image and styled table of contents enhances the overall look of the presentation. The focused table of contents and the proper attribution for the image are excellent additions.
Consider adding a brief comment explaining the significance of sections 1-4 in the table of contents, if applicable. This could provide context for why these specific sections are highlighted.
82-82
: Good use of a variable for the cluster account!This change improves the flexibility and maintainability of the document.
Consider using a consistent naming convention for variables throughout the document. For example, you might want to use
\VAR{cluster_account}
instead of\verb,<++cluster.account++>,
if this aligns with other variable usages in your LaTeX setup.
Line range hint
1-89
: Well-structured and informative presentation content!The overall structure and content of the presentation are excellent. The logical flow, use of images, and clear explanations contribute to an effective learning experience for the audience.
To further enhance the presentation, consider adding a summary or recap slide at the end. This could reinforce key points and provide a clear takeaway for the audience. Additionally, if time permits, you might want to include a slide with resources for further learning or support.
slides/common/HPC_101.tex (3)
3-16
: Great addition of visual elements and improved navigation!The new background template with a cluster computing image and the "Using Code-Studio" frame with a section-specific table of contents significantly enhance the presentation's visual appeal and navigability.
Consider adding a brief description or caption for the background image to provide context for the audience. You could add this as a comment in the LaTeX code for future reference:
% Background image: Cluster computing visualization \usebackgroundtemplate{ \vbox to \paperheight{\vfil\hbox to \paperwidth{\hfil\includegraphics[height=\paperheight]{misc/cluster_kit}\hfil}\vfil} % source: https://en.m.wikipedia.org/wiki/File:Text-x-python.svg }
112-135
: Excellent addition of practical SLURM workflow examples!The new frame effectively illustrates how to construct complex workflows using SLURM commands in a Bash script. This practical example will greatly benefit learners in understanding real-world applications of cluster computing.
To further improve readability and understanding:
- Consider adding comments to explain the purpose of each job (job1.sh, job2.sh, job3.sh).
- You might want to highlight the
--dependency
flag usage with a brief explanation.Here's a suggested modification:
\begin{lstlisting}[language=Bash, style=Shell] # First, do some pre-processing for the first job. ... # Submit the first job (e.g., data preparation) jid1=$(sbatch ... job1.sh) # NOTE: ALL 'job*sh' scripts are bash scripts, # with more logic than the "hello world" script. # Next, do some more logic as pre-processing for the # follow-up jobs. ... # Multiple jobs can depend on a single job # Here, we use --dependency=afterany to ensure job2 and job3 # only start after job1 has completed (successfully or not) jid2=$(sbatch --dependency=afterany:$jid1 ... job2.sh) # e.g., data analysis jid3=$(sbatch --dependency=afterany:$jid1 ... job3.sh) # e.g., result visualization \end{lstlisting}
Line range hint
1-137
: Overall excellent improvements to the presentation structure and content!The changes made in this pull request significantly enhance the quality and effectiveness of the HPC_101 presentation. Key improvements include:
- Addition of visually appealing elements (background image, formatted table of contents).
- Clearer objectives and context for learners.
- Practical examples of SLURM workflow construction.
- Improved readability of both the presentation and the LaTeX source code.
These modifications align well with the PR objectives of addressing minor issues and refining details encountered during the summer course.
As a final consideration, to ensure consistency across the entire presentation:
- Review other sections of the presentation to see if similar enhancements (e.g., background images, section-specific table of contents) could be applied.
- Consider creating a LaTeX command or environment for frequently used formatting (like the mdframed table of contents) to ensure consistency and ease of future updates.
Example:
\newcommand{\sectiontoc}{ \begin{mdframed}[tikzsetting={draw=white,fill=white,fill opacity=0.8, line width=0pt},backgroundcolor=none,leftmargin=0, rightmargin=150,innertopmargin=4pt,roundcorner=10pt] \tableofcontents[currentsection,sections={1-4},hideothersubsections] \end{mdframed} }Then you can use it like this:
\frame{ \frametitle{Using Code-Studio} \sectiontoc }This approach would make it easier to maintain a consistent style throughout the presentation and simplify future updates.
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (2)
images/humor/DALLE_LEGO-scientist-coding.jpg
is excluded by!**/*.jpg
images/misc/cluster_kit.jpg
is excluded by!**/*.jpg
Files selected for processing (10)
- .github/workflows/release-please.yml (1 hunks)
- config/config_Mainz_NHR.yaml (1 hunks)
- setup_creators/condarc_mogon (1 hunks)
- setup_creators/tutorial/profile/Mogon_NHR/config.yaml (1 hunks)
- slides/Snakemake_HPC_User_Creator_Combi.tex (1 hunks)
- slides/common/HPC_101.tex (3 hunks)
- slides/common/Plotting_DAGs.tex (1 hunks)
- slides/common/Sample_Data.tex (1 hunks)
- slides/common/mod_cs_mainz.tex (2 hunks)
- slides/creators/Python_in_Snakemake.tex (2 hunks)
Files skipped from review due to trivial changes (1)
- slides/common/Sample_Data.tex
Additional comments not posted (11)
setup_creators/condarc_mogon (3)
13-13
: Approved: Addition ofchannel_priority: strict
This change enforces a strict channel priority, which can lead to more predictable and reproducible environments. Conda will now only consider the highest-priority channel that provides a package when resolving dependencies.
Note that this may limit access to newer package versions in lower-priority channels. Ensure this aligns with your project's requirements for package versioning and availability.
14-14
: Approved: Addition of custom environment promptThe new
env_prompt
setting enhances user experience by displaying the current Conda environment name in the command prompt. This visual cue helps prevent accidental use of the wrong environment, especially when working with multiple projects.The format
"($(basename {default_env})) "
is well-chosen:
- It uses parentheses for clear separation.
basename
ensures only the environment name is shown, not the full path.- The trailing space provides good separation from the rest of the prompt.
Line range hint
2-5
: Please clarify the removal of ther
channelThe AI-generated summary indicates that the
r
channel has been removed from the list of channels. However, this change is not visible in the provided code snippet. Could you please confirm if this change was intentional?If the
r
channel was indeed removed:
- What was the reason for its removal?
- How will this affect users who rely on R packages in their Conda environments?
- Is there an alternative source for R packages that users should be aware of?
To confirm the channel list changes, please run the following command:
This will show the exact changes made to the channel list.
setup_creators/tutorial/profile/Mogon_NHR/config.yaml (1)
1-2
: LGTM: Default resources configuration looks good.Setting a default SLURM partition to "small_cpu" is a good practice for resource management, especially for lightweight tasks or testing purposes.
config/config_Mainz_NHR.yaml (2)
56-57
: LGTM: Cluster name addedThe addition of the
name
field with the value "Mogon-NHR" is a good practice. It clearly identifies the specific cluster being configured, which enhances the configuration's clarity and maintainability.
54-67
: Overall: Good improvements to cluster configurationThe changes made to the cluster section of this configuration file are beneficial. They provide more detailed information about the cluster environment, which can be crucial for proper setup and troubleshooting. The additions of the cluster name, distribution, and display program all contribute to a more comprehensive and clear configuration.
These changes align well with the PR objective of fixing "various little details". They enhance the configuration's clarity and usefulness without introducing any significant risks or issues.
slides/Snakemake_HPC_User_Creator_Combi.tex (1)
52-54
: Approved: Inclusion of HPC_101 moved to a more suitable positionThe inclusion of
common/HPC_101
has been moved to a more appropriate position in the presentation structure, now placed after the software environment introduction and before selecting workflows. This change likely improves the logical flow of the presentation.To ensure the change doesn't negatively impact the presentation flow, please run the following script to verify the order of included files:
Review the output to confirm that the order of topics makes sense for your presentation flow.
Verification successful
Verified: Inclusion of HPC_101 correctly positioned
The order of included files in
slides/Snakemake_HPC_User_Creator_Combi.tex
confirms thatcommon/HPC_101
is correctly placed aftercommon/software_environment
and beforeusers/Selecting_Workflows
, ensuring a logical flow in the presentation structure.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the order of included files in the LaTeX document # Test: Extract and display the order of included files sed -n 's/^\\include{\([^}]*\)}.*/\1/p' slides/Snakemake_HPC_User_Creator_Combi.texLength of output: 496
slides/common/HPC_101.tex (2)
31-31
: Excellent addition to the objectives!The new objective "We want to give you an idea of building a workflow with pure batch system commands, first." provides valuable context for the upcoming content. It sets clear expectations for the learners and aligns well with the overall structure of the presentation.
136-137
: Approved: Improved source code readabilityThe addition of empty lines between frames enhances the readability of the LaTeX source code. This is a good practice for maintaining clean and organized code.
slides/creators/Python_in_Snakemake.tex (2)
366-371
: LGTM! Improved structure and added useful shorthand.The changes enhance the presentation structure and provide a helpful shorthand option:
- The new task environment around the debug trial command improves readability and organization.
- The introduction of the
-F
shorthand for--forcerun
is a useful addition for users.These modifications will make the content more accessible to the audience and provide a convenient option for running Snakemake commands.
Also applies to: 376-377
388-390
: LGTM! Enhanced flexibility for different environments.The changes improve the adaptability of the presentation:
- Adding quotation marks around the Linux distribution name enhances readability.
- Replacing the static reference to
display
with a dynamic placeholder<++cluster.display_program++>
allows for greater flexibility across different environments.These modifications make the presentation more versatile and easier to customize for various clusters or setups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (3)
setup_creators/tutorial/profile/Mogon_NHR/config.yaml (3)
1-10
: Great header comments, minor formatting improvements suggested.The header comments provide excellent context and guidance for using this configuration file. They clearly explain the purpose of the file and factors to consider when adjusting settings.
To improve readability and adhere to YAML best practices:
- Remove trailing spaces on lines 2 and 8.
- Consider using a consistent comment style. For example, you could use
#
followed by a single space for all comments.Here's a suggested format:
# Resource configuration for bioinformatics workflow on Mogon NHR # # This file defines resource allocations for various bioinformatics tools. # Adjust these settings based on your specific dataset characteristics: # - Input file sizes # - Genome complexity # - Coverage depth # # Remember to monitor job performance and adjust resources as needed.Tools
yamllint
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 8-8: trailing spaces
(trailing-spaces)
14-29
: Resource configurations are well-defined, but require formatting improvements.The resource configurations for each tool (bwa_map, samtools_sort, samtools_index, and bcftools_call) are well-structured and include helpful comments. However, there are some inconsistencies in indentation and formatting that should be addressed to improve readability and adhere to YAML best practices.
Please consider the following improvements:
- Use consistent indentation throughout the file. YAML typically uses 2 spaces for indentation.
- Align the
runtime
andmem_mb_per_cpu
keys within each tool's configuration.- Remove trailing spaces (e.g., on line 18).
Here's a suggested format:
set-resources: bwa_map: runtime: 5 # Adjust based on your dataset size. Typical range: 30-120 minutes mem_mb_per_cpu: 1800 # Increase for larger datasets, e.g., 3600 for 30x coverage samtools_sort: runtime: 5 # Adjust based on your dataset size. Typical range: 20-60 minutes mem_mb_per_cpu: 3600 # Sorting requires more memory. Increase for larger datasets samtools_index: runtime: 2 # Indexing is usually quick. Adjust if needed for very large files mem_mb_per_cpu: 1800 # Indexing typically requires less memory than sorting bcftools_call: runtime: 15 # Variant calling can be time-consuming. Adjust based on genome size and coverage mem_mb_per_cpu: 2400 # Increase for larger genomes or higher coverage datasetsThese changes will make the configuration more consistent and easier to read.
Tools
yamllint
[warning] 15-15: wrong indentation: expected 2 but found 8
(indentation)
[warning] 16-16: wrong indentation: expected 10 but found 11
(indentation)
[error] 18-18: trailing spaces
(trailing-spaces)
[error] 19-19: syntax error: expected , but found ''
(syntax)
[warning] 20-20: wrong indentation: expected 6 but found 8
(indentation)
[warning] 24-24: wrong indentation: expected 6 but found 8
(indentation)
[warning] 28-28: wrong indentation: expected 6 but found 8
(indentation)
14-29
: Consider adding a comment about tutorial settings.The current resource settings are appropriately low for tutorial data, as confirmed in past discussions. However, it might be helpful for future users to explicitly state that these are tutorial settings.
Consider adding a comment at the beginning of the
set-resources
section to clarify this:set-resources: # Note: These settings are optimized for tutorial data and are intentionally low. # For production use with larger datasets, increase these values accordingly.This addition will help prevent confusion for users who might be adapting this configuration for non-tutorial use.
Tools
yamllint
[warning] 15-15: wrong indentation: expected 2 but found 8
(indentation)
[warning] 16-16: wrong indentation: expected 10 but found 11
(indentation)
[error] 18-18: trailing spaces
(trailing-spaces)
[error] 19-19: syntax error: expected , but found ''
(syntax)
[warning] 20-20: wrong indentation: expected 6 but found 8
(indentation)
[warning] 24-24: wrong indentation: expected 6 but found 8
(indentation)
[warning] 28-28: wrong indentation: expected 6 but found 8
(indentation)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- .github/workflows/release-please.yml (1 hunks)
- setup_creators/tutorial/profile/Mogon_NHR/config.yaml (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- .github/workflows/release-please.yml
Additional context used
yamllint
setup_creators/tutorial/profile/Mogon_NHR/config.yaml
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 8-8: trailing spaces
(trailing-spaces)
[warning] 15-15: wrong indentation: expected 2 but found 8
(indentation)
[warning] 16-16: wrong indentation: expected 10 but found 11
(indentation)
[error] 18-18: trailing spaces
(trailing-spaces)
[error] 19-19: syntax error: expected , but found ''
(syntax)
[warning] 20-20: wrong indentation: expected 6 but found 8
(indentation)
[warning] 24-24: wrong indentation: expected 6 but found 8
(indentation)
[warning] 28-28: wrong indentation: expected 6 but found 8
(indentation)
Additional comments not posted (2)
setup_creators/tutorial/profile/Mogon_NHR/config.yaml (2)
11-12
: Default resources section looks good.The default resources section correctly defines the SLURM partition as 'small_cpu'. This provides a good baseline for all jobs in the workflow.
1-29
: Overall, excellent configuration file with minor improvements suggested.This configuration file for the bioinformatics workflow on Mogon NHR is well-structured and provides clear guidance for resource allocation. The header comments are informative, and the resource settings for each tool are appropriately configured for tutorial data.
Key strengths:
- Clear and informative header comments
- Well-defined default resources
- Tool-specific configurations with helpful comments
Suggested improvements:
- Minor formatting and indentation adjustments for better readability
- Addition of a comment clarifying that these are tutorial settings
Once these minor changes are implemented, this configuration file will serve as an excellent template for managing resources in bioinformatics workflows, particularly for educational purposes.
Tools
yamllint
[error] 2-2: trailing spaces
(trailing-spaces)
[error] 8-8: trailing spaces
(trailing-spaces)
[warning] 15-15: wrong indentation: expected 2 but found 8
(indentation)
[warning] 16-16: wrong indentation: expected 10 but found 11
(indentation)
[error] 18-18: trailing spaces
(trailing-spaces)
[error] 19-19: syntax error: expected , but found ''
(syntax)
[warning] 20-20: wrong indentation: expected 6 but found 8
(indentation)
[warning] 24-24: wrong indentation: expected 6 but found 8
(indentation)
[warning] 28-28: wrong indentation: expected 6 but found 8
(indentation)
all the little details, while teaching a block course during the summer break in Mainz, Germany.
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Documentation