egonina edited this page Jul 15, 2013 · 16 revisions

PyCASP Manual

PyCASP aims to provide a single software environment for productive, efficient, portable and scalable application development. PyCASP is a collection of specializers (mini-compilers) that automatically map computations onto parallel processors (NVIDIA GPUs, Intel multicore CPUs and clusters). PyCASP targets audio content analysis applications (speech and music processing, for example); however, the specializers can be used for other applications (at your own risk, of course). PyCASP's specializers are built on top of the ASP framework (https://github.com/shoaibkamil/asp).

AUTHORS: Katya Gonina, Henry Cook, Shoaib Kamil

DISCLAIMER: This is research code and is a work in progress. Use at your own risk!

CONTACT: egonina (at) eecs (dot) berkeley (dot) edu for questions etc.

Installing the framework

Simply check the code out of the repo, and in the base directory run

$> python setup.py install --user

or

$> sudo python setup.py install

The package managers should fetch ASP and all its attendant dependencies and install all of them on your machine. If you have trouble with this step, consult these directions for manual installation of ASP. You can also get ASP pre-installed on a VM Image.

However, there are some external requirements that Python package managers cannot take care of on your behalf, specifically the compilers required to actually build the specialized code.

If you want to use the CUDA backend for GPUs, you must install NVIDIA's compiler (nvcc), runtime and driver, and have at least one GPU card. The compiler must be on your $PATH, and the runtime libraries must be on your $LD_LIBRARY_PATH. We recommend a CUDA toolkit release newer than 3.0 (4.1 in particular), but the specializer should work with cards of compute capability as low as 1.2.

If you want to use the Cilk+ backend for multicores, you must install Intel's compiler (icc), libraries, and the Cilk+ runtime. The compiler must be on your $PATH, and the runtime libraries must be on your $LD_LIBRARY_PATH. We recommend the 12.0.5 release of Cilk+.
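Before running the specializers, it can help to confirm that the compilers named above are actually visible from your environment. The following is a minimal sketch, not part of PyCASP: the helper names find_on_path and check_backends are hypothetical, and the check only looks for executables called nvcc and icc on $PATH and tests whether $LD_LIBRARY_PATH is set at all. It does not verify toolkit versions or that the runtime libraries are really present.

```python
import os


def find_on_path(program, path=None):
    """Return the full path of an executable named `program`, or None.

    `path` defaults to the PATH environment variable.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_backends():
    """Report which backend compilers are visible (hypothetical helper)."""
    report = {
        # CUDA backend needs NVIDIA's nvcc; Cilk+ backend needs Intel's icc.
        "cuda_compiler": find_on_path("nvcc"),
        "cilk_compiler": find_on_path("icc"),
        # Only checks that the variable is non-empty, not what it contains.
        "ld_library_path_set": bool(os.environ.get("LD_LIBRARY_PATH")),
    }
    return report


if __name__ == "__main__":
    for key, value in sorted(check_backends().items()):
        print("%s: %s" % (key, value))
```

A value of None for a compiler means that backend cannot be built from the current shell; fix $PATH before moving on to the configuration file.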

Finally, all specializers built on ASP have a configuration file that contains some simple directives for each specializer. We provide an example configuration in asp_config.yml. If you already have some ASP-based specializers installed, just append this file to the existing one. Otherwise, copy it to ~/.asp_config.yml. With these settings you can control whether the Cilk or CUDA backend will be the target of specialization, which CUDA device specialized code will be run on, and whether the specializer will attempt to auto-tune itself to your particular machine and problem space (experimental).
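The copy-or-append step described above can be scripted. This is a sketch under assumptions: install_asp_config is a hypothetical helper, not something shipped with PyCASP or ASP, and it appends the example file verbatim without parsing the YAML. If your existing file already has entries for the same specializer, you will need to merge them by hand.

```python
import os
import shutil


def install_asp_config(example_path, target_path=None):
    """Copy the example config to ~/.asp_config.yml, or append to it.

    Returns "copied" if the target did not exist, "appended" otherwise.
    Hypothetical helper; the append is plain text, not a YAML merge.
    """
    if target_path is None:
        target_path = os.path.join(os.path.expanduser("~"), ".asp_config.yml")

    if not os.path.exists(target_path):
        # No ASP-based specializers configured yet: start from the example.
        shutil.copyfile(example_path, target_path)
        return "copied"

    # An ASP config already exists: add the example settings at the end.
    with open(example_path) as src:
        example_text = src.read()
    with open(target_path, "a") as dst:
        dst.write("\n" + example_text)
    return "appended"


if __name__ == "__main__":
    print(install_asp_config("asp_config.yml"))
```

Run it from the base directory of the checkout, where asp_config.yml lives, and then review the resulting file before running the tests.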

Once you think the Python dependencies, compilers, environment variables and config file are set up correctly, try

$> ./run_tests.sh

Then take a look at the sample applications provided in examples/ and read on.

Specializers

PyCASP comes with the following specializers:

  1. GMM Training and Classification (on GPUs and Intel CPUs)
  2. SVM Training and Classification (on GPUs)

PyCASP also contains several composition optimizations as well as a MapReduce module to enable running jobs on a cluster of machines. (to be continued...)

Importing PyCASP specializers

When you install PyCASP, all of PyCASP's specializers are installed in the Python package directory, along with an internal package containing PyCASP's composition logic.

To import a particular specializer, in your python code use:

from gmm_specializer.gmm import *

and

from svm_specializer.svm import *

For a usage description of each specializer, see the corresponding wiki pages.
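If you are unsure whether the installation put the specializer modules where Python can find them, a quick import check can tell you. This is only a sketch: available_specializers is a hypothetical helper, and it reports just whether each of the two module names given above can be imported. It catches ImportError only, so any other failure raised while a module loads will still propagate.

```python
import importlib


def available_specializers(modules=("gmm_specializer.gmm",
                                    "svm_specializer.svm")):
    """Map each module name to True if it imports, False otherwise.

    Hypothetical helper; the default names are the two import paths
    listed in this manual.
    """
    found = {}
    for name in modules:
        try:
            importlib.import_module(name)
            found[name] = True
        except ImportError:
            found[name] = False
    return found


if __name__ == "__main__":
    for name, ok in sorted(available_specializers().items()):
        print("%s: %s" % (name, "importable" if ok else "not found"))
```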
