felipenoris/math-server-docker

math-server-docker

Dockerfile to build the ideal multi-user Data Science server with Jupyterhub and RStudio, ready for Python, R and Julia languages.

It is based on the CentOS 7 image. CentOS is a very stable Linux distribution, compatible with Red Hat (widely used in corporations), but it often ships outdated packages. To provide up-to-date tools, this Dockerfile builds most of them from source.

List of tools

There's also a number of utilities under the hood:

Usage

To build the image, run the following command:

# docker build -t math-server .

Be patient. It may take up to 8 hours to complete.

After the build is complete, you can start the server with:

# docker run -d -p 8787:8787 -p 8000:8000 --name ms1 math-server

With a running container, you can go ahead and create users:

# docker exec ms1 useradd myuser

# docker exec -it ms1 passwd myuser

The default ports are:

  • 8787 for RStudio

  • 8000 for Jupyter
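
Once the container is running, a quick way to confirm that both services accept connections is a small TCP probe. The sketch below is only an illustration: the helper name `is_port_open` is hypothetical, and `localhost` assumes the container was started on the same machine with the port mappings shown above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the list above; the host name is an assumption.
    for name, port in (("RStudio", 8787), ("Jupyter", 8000)):
        state = "open" if is_port_open("localhost", port) else "closed"
        print(f"{name} (port {port}): {state}")
```

A port reported as closed usually means the container is not running or the port was not published with `-p`.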

Requirements

Docker.

Application specific notes

The last command in the Dockerfile starts Jupyterhub and RStudio:

CMD /usr/lib/rstudio-server/bin/rserver \
	&& jupyterhub --no-ssl -f jupyterhub_config.py

Jupyter

Data files are at /usr/local/share/jupyter/hub.

By default, Jupyter is accessible at http://localhost:8000. It creates its state files (jupyterhub_cookie_secret, jupyterhub.sqlite) in the current directory and uses the default configuration.

You can generate a sample configuration file with:

# jupyterhub --generate-config

To start the server using a configuration file, use:

# jupyterhub -f jupyterhub_config.py

To set IP and port, use:

# jupyterhub --ip=192.168.1.2 --port=443
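
The same settings can also be kept in the configuration file instead of being passed on the command line. The fragment below is a sketch that assumes a JupyterHub version where `c.JupyterHub.ip` and `c.JupyterHub.port` are the matching options; the address and port are the example values from the command above.

```python
# jupyterhub_config.py (fragment)
# `c` is the config object JupyterHub provides when it loads this file.
c = get_config()  # noqa: F821

# Example values only; replace them with the address and port of your server.
c.JupyterHub.ip = '192.168.1.2'
c.JupyterHub.port = 443
```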

You may have to open the port for external access:

# /sbin/iptables -I INPUT -p tcp -m tcp --dport 8000 -j ACCEPT
# /sbin/service iptables save

For https support, add the following lines to the config file:

c.JupyterHub.port = 443
c.JupyterHub.ssl_cert = '/root/.ssh/sample-cert.pem'
c.JupyterHub.ssl_key = '/root/.ssh/sample-key.pem'

443 is the default port for https, so the server will be accessible at https://localhost.

sample-cert.pem is the signed certificate file, and sample-key.pem is the private SSL key.

You can generate a self-signed certificate by running the commands below, but be aware that your browser will not recognize the certificate as trusted.

# mkdir -p ~/.ssh
# openssl req -x509 -newkey rsa:2048 -keyout ~/.ssh/sample-key.pem -out ~/.ssh/sample-cert.pem -days 9999 -nodes -subj "/C=BR/ST=Rio de Janeiro/L=Rio de Janeiro/O=org/OU=unit/CN=website"
# chmod 400 ~/.ssh/sample*.pem

This project provides a minimal jupyterhub_config.py configuration file that sets a few important environment variables to be passed to spawned child processes, namely: 'PATH', 'LD_LIBRARY_PATH', 'JAVA_HOME', 'CPATH', 'CMAKE_ROOT', 'http_proxy', 'https_proxy'.
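
For illustration, forwarding these variables could look like the fragment below. This is a sketch that assumes the variables are passed through JupyterHub's `c.Spawner.env_keep` option; the provided file may do it differently.

```python
# jupyterhub_config.py (fragment)
# `c` is the config object JupyterHub provides when it loads this file.
c = get_config()  # noqa: F821

# Variables from the hub's own environment that each spawned
# single-user server should inherit (assumed mechanism: env_keep).
c.Spawner.env_keep = [
    'PATH',
    'LD_LIBRARY_PATH',
    'JAVA_HOME',
    'CPATH',
    'CMAKE_ROOT',
    'http_proxy',
    'https_proxy',
]
```

Assigning the list replaces JupyterHub's default `env_keep` entries, so any default variable that is still needed has to be listed explicitly.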

Jupyterlab

Jupyterlab is the default user interface. This behavior is set by the following line in the provided jupyterhub_config.py file:

c.Spawner.default_url = '/lab'

To revert to the old Jupyter user interface, you can either open the /tree URL manually (as in http://localhost:8000/user/username/tree) or edit jupyterhub_config.py and delete the c.Spawner.default_url line.
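
As an alternative to deleting the line, the default URL could be pointed at the classic interface explicitly. This is a sketch, assuming the classic notebook interface is still served under /tree in the installed Jupyter version.

```python
# jupyterhub_config.py (fragment)
# `c` is the config object JupyterHub provides when it loads this file.
c = get_config()  # noqa: F821

# Open the classic file browser instead of JupyterLab after login.
c.Spawner.default_url = '/tree'
```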

See Jupyterlab documentation for more information.

RStudio

Configuration files are at /etc/rstudio. There's also the Server Options file at /usr/lib/rstudio-server/R/ServerOptions.R.

Default port is 8787.

To change the default port, edit rserver.conf. The following commands switch RStudio to port 80 (note that tee overwrites any existing content of rserver.conf):

# echo -e "www-port=80" | tee /etc/rstudio/rserver.conf
# rstudio-server restart
# rstudio-server verify-installation

The auth-pam-sessions-profile directive in /etc/rstudio/rserver.conf may not work. If that happens, RStudio will look at /etc/pam.d/rstudio.

Proxy settings are not configured in RStudio by default. If you are running behind a proxy, update the ServerOptions.R file, for example with the following Dockerfile instructions:

RUN echo "options(download.file.method = 'wget')" >> /usr/lib/rstudio-server/R/ServerOptions.R
RUN echo "Sys.setenv(http_proxy = 'my-proxy-url')" >> /usr/lib/rstudio-server/R/ServerOptions.R
RUN echo "Sys.setenv(https_proxy = 'my-proxy-url')" >> /usr/lib/rstudio-server/R/ServerOptions.R

Packages

Python

Users can install packages with the conda or pip command-line tools.

With pip, users can install local packages for Python2 using:

$ source activate py2

$ pip install --user pkgname

And also for Python3 using:

$ source activate py3

$ pip install --user pkgname

Refer to the conda documentation to install packages with the conda utility.

R

Check package locations with $ R -e '.libPaths()'.

System packages will be installed at /usr/lib64/R/library.

Each user can have a local package dir, automatically created under ~/R.

The root user can install system packages with the R -e 'install.packages("pkg-name")' command.

Julia

Since Julia v1.0, system packages are disabled. Only user-level packages are supported.

To install the IJulia kernel, open a terminal, start the Julia REPL and run the following commands:

julia> using Pkg

julia> pkg"add IJulia"

Restart your Jupyter session. After that, a Julia notebook option should show up.

LaTeX

The Docker image comes with a LaTeX distribution installed with TeX Live. TeX packages can be managed using tlmgr.

System-wide packages can be installed using:

# tlmgr install [pkgname]

Users can also install local packages. To do that, a user must initialize a ~/texmf tree:

$ tlmgr init-usertree

After that, the user can install local packages using:

$ tlmgr --usermode install [pkgname]
