Skip to content

Update dockerfile to PyMC v4 #5881

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 8 commits into from
Jun 18, 2022

Conversation

danhphan
Copy link
Member

This PR updates the dockerfile to PyMC v4, which addresses some parts of the issue #5323

  • Update the Dockerfile to a fresh base image for PyMC 4.0.0
  • Fix NumPy blas warnings
  • Run pm.model_to_graphviz(model) successfully

Todos:

  • Update docs on how to build and run the docker image
  • Create a GitHub Action to build & push the Docker image (need guidance)

Thanks.

@codecov
Copy link

codecov bot commented Jun 12, 2022

Codecov Report

Merging #5881 (10e5a83) into main (d7b5153) will increase coverage by 0.03%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5881      +/-   ##
==========================================
+ Coverage   89.49%   89.52%   +0.03%     
==========================================
  Files          73       73              
  Lines       13267    13267              
==========================================
+ Hits        11873    11877       +4     
+ Misses       1394     1390       -4     
Impacted Files Coverage Δ
pymc/aesaraf.py 91.95% <100.00%> (ø)
pymc/data.py 81.63% <100.00%> (ø)
pymc/smc/smc.py 96.45% <100.00%> (ø)
pymc/step_methods/hmc/base_hmc.py 90.47% <0.00%> (+0.79%) ⬆️
pymc/parallel_sampling.py 86.79% <0.00%> (+0.99%) ⬆️

@danhphan
Copy link
Member Author

The docker guideline could be somethings like:

Build docker image

clone git@github.com:pymc-devs/pymc.git
cd pymc
bash scripts/build_container.sh

This building process may take a while. After the script successful finished, there should be a docker image named pymc in your machine.

How to run

You can now run the pymc docker container from jupyter notebook or from bash:

# Run with jupyter notebook
bash scripts/run_container_bash.sh
# Run with bash
bash scripts/run_container_bash.sh

The current docker document was commented in CONTRIBUTING.md. Should we update the docker guideline here in the CONTRIBUTING.md file? Or should it be moved into the documentation website like other parts (pull-request guideline, pre-commit guideline, ...), and then link to that docs.

cc @ricardoV94 @michaelosthege @twiecki @OriolAbril

Thank you.

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

The CI pipeline will be an important addition, but let's do that on a follow-up PR.


# For running from jupyter notebook
EXPOSE 8888
CMD ["conda", "run", "--no-capture-output", "-n", "pymc-dev-py39", "jupyter","notebook","--ip=0.0.0.0","--port=8888","--no-browser","--allow-root"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can --allow-root be removed, since the user was switched?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agree, this makes sense. Let me remove it :)

@ricardoV94 ricardoV94 changed the title [WIP] Update dockerfile to PyMC v4 pdate dockerfile to PyMC v4 Jun 14, 2022
@ricardoV94 ricardoV94 changed the title pdate dockerfile to PyMC v4 Update dockerfile to PyMC v4 Jun 14, 2022
@ricardoV94 ricardoV94 marked this pull request as draft June 14, 2022 11:32
@OriolAbril
Copy link
Member

The current docker document was commented in CONTRIBUTING.md. Should we update the docker guideline here in the CONTRIBUTING.md file? Or should it be moved into the documentation website like other parts (pull-request guideline, pre-commit guideline, ...), and then link to that docs.

It should be on the docs, not on CONTRIBUTING.md (it could have a link there though if we think it is relevant enough)


ADD $SRC_DIR /home/jovyan/
RUN /bin/bash /home/jovyan/scripts/create_testenv.sh --global --no-setup
USER root
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Containers should not run as root. Is there a reason this is needed?

https://theorangeone.net/posts/containers-as-root/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi agree, it has been removed :)

@@ -0,0 +1,12 @@
#! /bin/bash

SRC_DIR=${SRC_DIR:-`pwd`}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this script be merged with run_container_jupyter.sh? Theyre so similar it seems itd be easier to maintain one, rather than two

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually can all three bash scripts be merged? One with a nice CLI would be better for both users and maintenance

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have combine them into one script, and also update the Docker documents.

@canyon289
Copy link
Member

Great stuff, thanks updating the dockerfile @danhphan!

@danhphan danhphan force-pushed the update_dockerfile_v4 branch from c8b907e to 195f931 Compare June 15, 2022 13:06
@danhphan
Copy link
Member Author

Hi, thank you all for the suggestions, I have:

  • removed the root user and --allow-root from the Dockerfile,
  • combined the three bash scripts into one,
  • updated the Docker documents in docs folder and also linked it in the CONTRIBUTING.md,
  • removed obsolete scripts (create_testenv.sh and install_miniconda.sh), but not sure if it is still used anywhere else?

@@ -0,0 +1,21 @@
(pr_checklist)=
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is defining a target to which we can then link from anywhere in the docs to generate a link that points here. The pr_checklist label is already in use so it can't be used and it should be something specific to this file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, got it.

Comment on lines 11 to 12
$ bash scripts/docker_container.sh bash # running the container with bash
$ bash scripts/docker_container.sh bash # running the container on jupyter notebook
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the $ so people can use the copy button that appears on code blocks directly. And both commands look the same, I think the 2nd one should use jupyter as argument or something of the sort.

```bash
$ bash scripts/docker_container.sh bash # logon to the container
$ cd ./pymc/tests
$ . ./../../scripts/test.sh # takes a while!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the dollar sign comment, do we use these scripts/test.sh in CI? Shouldn't the advise be to use pytest commands explicitly like we explain on https://www.pymc.io/projects/docs/en/stable/contributing/running_the_test_suite.html? We also recommend not running the whole test suite all at once unless necessary which that would seem to point against.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this content is from the old docs. I will remove them then. Also, not sure if the scripts/test.sh is used in any CI. If it is not used, we should also remove this script.


#Setup working forder
RUN mkdir /home/jovyan/workspace
WORKDIR /home/jovyan/workspace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can double check myself tomorrow, but I think the jupyter base image makes a directory /home/jupyter/work that you're supposed to use as a mount target. If so, I think you could remove 20-22. (also very small spelling nit: forder -> folder on line 20)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi yes, I will check it as well :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup just build the container, everything works great and mounts correctly. There is both work and workspace in there though. It's mentioned very briefly in their docs here: https://jupyter-docker-stacks.readthedocs.io/en/latest/using/common.html#additional-runtime-configurations. It'd be a little neater to remove RUN mkdir /home/jovyan/workspace and use WORKDIR /home/jovyan/work instead.

Also, just a heads up that jovyan should be replaced with $NB_USER in your Dockerfile, since that is something that's configurable too. I think it adds unnecessary complication to scripts/docker_container.sh though for this case, but might come in handy in other contexts. Just want to mention if by chance you hadn't seen it!

@twiecki twiecki marked this pull request as ready for review June 17, 2022 06:49
@twiecki
Copy link
Member

twiecki commented Jun 17, 2022

Is this good to merge?

@canyon289
Copy link
Member

No, we're waiting on Bills comment about working directory

@OriolAbril
Copy link
Member

And my comments about docs still need to be addressed it looks like


elif [[ $COMMAND = 'bash' ]]; then
docker run -it -v $SRC_DIR:/home/jovyan/workspace --rm --name ${CONTAINER_NAME} ${CONTAINER_NAME} bash
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to have an explicit elif for the jupyter option, not necessary though. I also think it'd be nice to keep the bit at the end that launches the notebook in the browser automatically but that's just a "nice to have" and could be a later PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi yes, the $COMMAND is set to jupyter as default in COMMAND="${1:-jupyter}". So, after the docker image is built, we can run with jupyter as a default option.

@bwengals
Copy link
Contributor

I tried running it with all three options, build, bash, jupyter, everything works as expected! Nicely done. Other than the docs thing Oriol mentioned and the workspace -> work I think it's just about ready.

@danhphan danhphan force-pushed the update_dockerfile_v4 branch from 195f931 to 10e5a83 Compare June 18, 2022 06:55
@danhphan
Copy link
Member Author

Hi, I have updated the docs and replaced workspace with work. Hopefully it is alright now :) Thank you

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
There's no build or run script for Windows though, but we can also add that later.

@michaelosthege michaelosthege merged commit 0ba47b5 into pymc-devs:main Jun 18, 2022
@danhphan
Copy link
Member Author

Hi @michaelosthege , do you mean a script to create a Windows docker image? maybe we should not create a script for building a Windows-based docker image, since most docker image are linux-based, and most cloud systems use linux os. In case of users that have Windows os with docker on their machine, they can also use the ubuntu image we have here to build and run the docker container. Thanks.

@michaelosthege
Copy link
Member

michaelosthege commented Jun 20, 2022

No I mean not a Windows Docker image, but a docker_container.bat with the same functionality as docker_container.sh.

But it's not critical since docker_container.sh is mostly a convenience thing.
The same can be achieved by modifying the commands below:

# Build the image, tagged as "pymc:local"
docker build -t pymc:local .

# Run in Jupyter notebook mode
docker run -it -p 8888:8888 -v %cd%:/home/jovyan/work --rm --name pymcJupyter pymc:local

# Run in interactive bash mode
docker run -it -v %cd%:/home/jovyan/work --rm --name pymcBash pymc:local bash

@danhphan
Copy link
Member Author

Yes, totally agree :)

@danhphan danhphan deleted the update_dockerfile_v4 branch June 20, 2022 12:05
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants