This repository has been archived by the owner on Nov 16, 2019. It is now read-only.

Create_AMI

Andy Feng edited this page Mar 6, 2016 · 5 revisions

Create CaffeOnSpark AMI on EC2

Overview

This tutorial outlines the steps to create a CaffeOnSpark AMI on AWS EC2 from a g2.2xlarge (or g2.8xlarge) instance running Ubuntu 14.04.

  1. Launch an Ubuntu Server 14.04 LTS (HVM) AMI with a g2.2xlarge instance in Amazon EC2.
      1. Go to https://eu-west-1.console.aws.amazon.com/console
      2. Select EC2
      3. Request a Spot Instance
      4. Specify an AMI
      5. Specify the spot max price
      6. Wait for the instance to enter the running state
  1. Set up an EC2 key pair

Please follow the [AWS instructions](http://docs.aws.amazon.com/cli/latest/userguide/cli-ec2-keypairs.html) to create a key pair. Here is an example command:

export EC2_KEY=ec2_${USER}
export EC2_PEM_FILE=~/.ssh/ec2_${USER}.pem
ec2-create-keypair -O ${AWS_ACCESS_KEY_ID} -W ${AWS_SECRET_ACCESS_KEY} --region eu-west-1 ${EC2_KEY}
emacs ${EC2_PEM_FILE}
chmod 600 ${EC2_PEM_FILE}

The commands above create a private key file (PEM) that only you can access.
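
Since ssh ignores private keys that other users can read, it can help to confirm the file mode before connecting. The following is a minimal sketch, not part of the original instructions: `check_pem_mode` is a hypothetical helper name, and it assumes GNU `stat` (as shipped with Ubuntu).

```shell
# check_pem_mode FILE: succeed only if FILE exists and its mode is 600 or 400.
# Hypothetical helper; assumes GNU coreutils `stat -c`.
check_pem_mode() {
  local file="$1" mode
  [ -f "$file" ] || { echo "missing key file: $file" >&2; return 1; }
  mode=$(stat -c '%a' "$file")
  case "$mode" in
    600|400) return 0 ;;
    *) echo "mode $mode is too permissive for $file" >&2; return 1 ;;
  esac
}
```

It could be called as `check_pem_mode "${EC2_PEM_FILE}"` right after the `chmod` step.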

  1. SSH into your instance
ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ${EC2_PEM_FILE} root@<MASTER>
  1. Install prerequisites
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install build-essential
sudo apt-get install gcc g++ git openjdk-7-jdk
sudo apt-get install pssh
sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev protobuf-compiler gfortran libjpeg62 libfreeimage-dev libatlas-base-dev git python-dev python-pip libgoogle-glog-dev libbz2-dev libxml2-dev libxslt-dev libffi-dev libssl-dev libgflags-dev liblmdb-dev python-yaml python-numpy maven

sudo easy_install pillow
sudo ln /dev/null /dev/raw1394
echo "export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64" >> /root/.bashrc
source /root/.bashrc
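
Before moving on, it may be worth checking that the tools installed above are actually on the PATH. This is a small sketch under that assumption; `require_tools` is a hypothetical helper name, not something the CaffeOnSpark scripts provide.

```shell
# require_tools NAME...: report every named command that is not on PATH.
# Returns 0 when all commands are found, 1 otherwise.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For the packages listed in this step, a plausible call is `require_tools gcc g++ git javac mvn protoc pip`.
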
  1. Install CUDA as explained in the BVLC instructions

  2. Clone the CaffeOnSpark repo

git clone https://github.com/yahoo/CaffeOnSpark.git --recursive
pushd CaffeOnSpark/caffe-public/
cp Makefile.config.example Makefile.config
  1. Adjust Makefile.config.

You may want to uncomment the following line.

USE_CUDNN := 1
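
If you prefer to script this edit rather than open an editor, something like the sketch below could work. It assumes the option appears in Makefile.config as a commented-out line of the form `# USE_CUDNN := 1` and that GNU `sed` is available; `enable_cudnn` is a hypothetical helper name.

```shell
# enable_cudnn FILE: uncomment a "# USE_CUDNN := 1" line in FILE.
# Assumes GNU sed (-i without a backup suffix). Fails if the option
# is still not active afterwards, e.g. because the line looks different.
enable_cudnn() {
  local file="$1"
  sed -i 's/^#[[:space:]]*\(USE_CUDNN[[:space:]]*:=[[:space:]]*1\)[[:space:]]*$/\1/' "$file"
  grep -q '^USE_CUDNN' "$file"
}
```

From inside `caffe-public/`, it would be invoked as `enable_cudnn Makefile.config`.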

Make JDK include files accessible:

echo "INCLUDE_DIRS += ${JAVA_HOME}/include" >> Makefile.config
  1. Build CaffeOnSpark

If you are building on a CPU node, please adjust /root/CaffeOnSpark/Makefile to use "mvn -DskipTests=true package" instead of "mvn package".
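
The Makefile change for CPU nodes could be scripted roughly as follows. This sketch assumes the Makefile invokes Maven with the literal text `mvn package`, as quoted above, and that GNU `sed` is available; if other flags sit between `mvn` and `package`, the pattern needs adjusting. `skip_maven_tests` is a hypothetical helper name.

```shell
# skip_maven_tests MAKEFILE: rewrite "mvn package" to skip the test phase.
# Safe to run twice: the rewritten text no longer matches the pattern.
# Fails if no rewritten invocation is present afterwards.
skip_maven_tests() {
  local makefile="$1"
  sed -i 's/mvn package/mvn -DskipTests=true package/' "$makefile"
  grep -q -- 'mvn -DskipTests=true package' "$makefile"
}
```

On the instance it would be run as `skip_maven_tests /root/CaffeOnSpark/Makefile`.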

pushd ..
export CAFFE_ON_SPARK=/root/CaffeOnSpark
export LD_LIBRARY_PATH="${CAFFE_ON_SPARK}/caffe-public/distribute/lib:${CAFFE_ON_SPARK}/caffe-distri/distribute/lib:/usr/lib64:/lib64:/usr/local/cuda-7.0/lib64"
make build
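
A mistyped or missing LD_LIBRARY_PATH entry can go unnoticed until a library fails to load, so a quick check of the path list may save time. The sketch below is illustrative only; `missing_lib_dirs` is a hypothetical helper name.

```shell
# missing_lib_dirs PATHLIST: print each colon-separated entry of PATHLIST
# that is not an existing directory. Returns 1 if any entry is missing.
missing_lib_dirs() {
  local entry rc=0
  local IFS=':'
  for entry in $1; do
    [ -z "$entry" ] && continue
    if [ ! -d "$entry" ]; then
      echo "$entry"
      rc=1
    fi
  done
  return "$rc"
}
```

Running `missing_lib_dirs "${LD_LIBRARY_PATH}"` lists any entries that do not exist on the instance.
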
  1. Adjust environment settings
echo "export CAFFE_ON_SPARK=/root/CaffeOnSpark" >> /root/.bashrc
echo "export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> /root/.bashrc
echo "export HADOOP_HOME=/root/ephemeral-hdfs" >> /root/.bashrc
echo "export SPARK_HOME=/root/spark" >> /root/.bashrc
# Single quotes defer expansion until .bashrc is sourced, when HADOOP_HOME and SPARK_HOME are set.
echo 'export PATH=${PATH}:${HADOOP_HOME}/bin:${SPARK_HOME}/bin' >> /root/.bashrc
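
Re-running `echo ... >>` appends the same line again each time. If you expect to repeat this step, an idempotent append is one option. The sketch below is an assumption-laden illustration, and `append_once` is a hypothetical helper name. Passing the line in single quotes keeps any `${...}` references unexpanded until the file is sourced.

```shell
# append_once LINE FILE: add LINE to FILE unless an identical line exists.
append_once() {
  local line="$1" file="$2"
  touch "$file"
  # -x matches whole lines, -F treats LINE as a fixed string, not a regex.
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example: `append_once 'export SPARK_HOME=/root/spark' /root/.bashrc`.
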
  1. Install the MNIST and CIFAR-10 datasets
sudo ln /dev/null /dev/raw1394
${CAFFE_ON_SPARK}/scripts/setup-mnist.sh
${CAFFE_ON_SPARK}/scripts/setup-cifar10.sh

Adjust lenet_memory_train_test.prototxt and cifar10_quick_train_test.prototxt in ${CAFFE_ON_SPARK}/data/ so that each data source uses an absolute path:

source: "file:///root/CaffeOnSpark/data/mnist_train_lmdb/"
source: "file:///root/CaffeOnSpark/data/mnist_test_lmdb/"
source: "file:///root/CaffeOnSpark/data/cifar10_train_lmdb/"
source: "file:///root/CaffeOnSpark/data/cifar10_test_lmdb/"
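
To confirm that no relative path was left behind, the `source:` lines can be checked mechanically. This is a minimal sketch that treats any `source:` value not starting with `file:///` as non-absolute; `relative_sources` is a hypothetical helper name.

```shell
# relative_sources FILE: print (with line numbers) every "source:" line
# whose value does not start with file:///. Succeeds when there are none.
relative_sources() {
  local file="$1"
  ! grep -n 'source:' "$file" | grep -v 'source:[[:space:]]*"file:///'
}
```

For example, `relative_sources "${CAFFE_ON_SPARK}/data/lenet_memory_train_test.prototxt"` prints nothing and succeeds once every source is absolute.
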
  1. Use Amazon EC2 console to create an AMI image from your instance: Actions -> Image -> Create Image.