MATLAB Distributed Computing

The Distributed Computing Toolbox and the MATLAB Distributed Computing Engine enable you to execute independent MATLAB operations simultaneously on a cluster of computers, which speeds up the execution of large MATLAB jobs. A job is a large operation that you need to perform in your MATLAB session. Jobs are broken down into segments called tasks, and you decide how to divide your job into tasks. The tasks can be identical, but they do not have to be.
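
To make the job and task vocabulary concrete, here is a minimal sketch in plain MATLAB that needs neither the toolbox nor a cluster. The variable names (tasks, results) and the three sample functions are chosen only for illustration. It runs the tasks one after another on the local machine, whereas the engine would hand each one to a worker.

```matlab
% One "job" described as a list of tasks. Each task pairs a function
% handle with a cell array of input arguments, so tasks can differ.
tasks = { ...
    {@sum,   {1:100}}, ...      % task 1: add the integers 1 through 100
    {@max,   {[4 9 2]}}, ...    % task 2: a different function and input
    {@numel, {ones(3,3)}} };    % task 3: count the elements of a 3-by-3 matrix

% Evaluate every task and collect one result per task.
results = cell(numel(tasks), 1);
for k = 1:numel(tasks)
    fcn  = tasks{k}{1};
    args = tasks{k}{2};
    results{k} = feval(fcn, args{:});   % on a cluster, a worker does this step
end
```

Because the tasks do not depend on one another, they could be evaluated in any order or at the same time, which is the property that makes a job suitable for distribution.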

The MATLAB session in which the job and its tasks are defined is referred to as the client session. Often, this is on the machine where you program MATLAB. The client uses the Distributed Computing Toolbox to define jobs and tasks. The MATLAB Distributed Computing Engine is the product that executes your job by evaluating each of its tasks and returning the results to your client session. The job manager is the part of the Computing Engine that coordinates the execution of jobs and the evaluation of their tasks. It distributes tasks for evaluation to the individual MATLAB sessions of the Computing Engine, which are known as workers.

DECS runs a MathWorks job manager named matlaboss. The lookup address of the job manager is elric:29100. We currently have ten workers configured. If your job needs to write out files, you must submit it from the /egr/scratch file system, because the engine can write back only to this area. You can create your own directory on /egr/scratch to store your input files and results.
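
As a rough sketch of the scratch-area setup, the following lines create a personal directory under /egr/scratch and switch to it before any job is submitted. The directory name myusername is a placeholder, not a real account; substitute your own.

```matlab
% 'myusername' is a hypothetical directory name used for illustration.
workDir = fullfile('/egr/scratch', 'myusername');

% Create the directory on first use, then make it the current folder
% so that the job is submitted from the scratch file system.
if ~exist(workDir, 'dir')
    mkdir(workDir);
end
cd(workDir);
```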

Creating and Running Jobs with a Job Manager

The steps of a typical programming session with Distributed Computing Toolbox using a MathWorks job manager are:

Find a Job Manager

Create a Job

Create Tasks

Submit a Job to the Job Queue

Retrieve the Job's Results

Please Note: The objects that the client session uses to interact with the job manager are only references to data that is actually contained in the job manager process, not in the client session. After jobs and tasks are created, you can close your client session and restart it; your job is still stored in the job manager. You can find existing jobs using the 'findJob' function or the Jobs property of the job manager object.
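
The following sketch shows one way to locate existing jobs after restarting the client session. It assumes the job manager address given above. The 'State' filter and its value 'finished' are one example of a property search, and the variable names are illustrative.

```matlab
% Reconnect to the job manager from a fresh client session.
jm = findResource('scheduler','type','jobmanager','LookupURL','elric:29100');

% List every job that the job manager currently holds.
allJobs = findJob(jm);

% Narrow the search with a property-value pair (example filter).
doneJobs = findJob(jm, 'State', 'finished');

% The same job objects are also reachable through the Jobs property.
sameJobs = get(jm, 'Jobs');
```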


Find a Job Manager

jm = findResource('scheduler','type','jobmanager','LookupURL','elric:29100')

Use get(jm) to display the properties of the job manager object and confirm that the lookup succeeded.
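
Individual properties can also be queried by name. The property names below are recalled from the toolbox documentation of this era and should be treated as assumptions; the listing printed by get(jm) shows the names your version actually provides.

```matlab
% Query single properties instead of printing the whole listing.
get(jm, 'Name')                  % name of the job manager
get(jm, 'NumberOfIdleWorkers')   % workers currently free to take tasks
get(jm, 'NumberOfBusyWorkers')   % workers currently evaluating tasks
```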

Create a Job

Job1 = createJob(jm)


Create Tasks

After you have created your job, you can create tasks for the job using the 'createTask' function. Tasks define the functions to be evaluated by the workers when running the job. Often, the tasks of a job are identical. In this example, each task will generate a three-by-three matrix of random numbers.

createTask(Job1, @rand, 1, {3,3});

createTask(Job1, @rand, 1, {3,3});

createTask(Job1, @rand, 1, {3,3});

createTask(Job1, @rand, 1, {3,3});

createTask(Job1, @rand, 1, {3,3});
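
Since the five calls above are identical, they can also be written as a loop. This is a sketch of an equivalent alternative, not an additional step: run either the five calls or the loop, not both. The variable numTasks is introduced here for illustration.

```matlab
% Create five identical tasks; each returns one 3-by-3 random matrix.
numTasks = 5;
for k = 1:numTasks
    createTask(Job1, @rand, 1, {3,3});
end

% Tasks do not have to be identical. Uncommenting the next line would
% add a sixth task that returns a 4-by-4 magic square instead.
% createTask(Job1, @magic, 1, {4});
```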

Submit a Job to the Job Queue

To run your job and have its tasks evaluated, you submit the job to the job queue with the submit function.

submit(Job1)

The job manager distributes the tasks of Job1 to its registered workers for evaluation.
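
Submitting a job returns control to the client right away, so the results may not be available yet. A minimal sketch for checking on the job and blocking until it completes, assuming the waitForState function of the toolbox:

```matlab
% Inspect the current state of the job
% (pending, queued, running, or finished).
get(Job1, 'State')

% Block the client session until all tasks have been evaluated.
waitForState(Job1, 'finished');
```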

Retrieve the Job's Results

Use the function 'getAllOutputArguments' to retrieve the results from all the tasks in a job.

results = getAllOutputArguments(Job1);
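
The results come back as a cell array with one row per task and one column per output argument, so this example should yield five rows of one 3-by-3 matrix each once the job has finished. The sketch below displays each matrix and then removes the job from the job manager. The destroy call is assumed from the toolbox version this page describes; skip it if you want to keep the job.

```matlab
% Display the output of each task in turn.
for k = 1:size(results, 1)
    disp(results{k, 1});
end

% Remove the job and its data from the job manager once the results
% are safely stored in the client session.
destroy(Job1);
```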

Additional Help

The help files for the Distributed Computing Toolbox have a section of programming tips and notes. There are also documented demos for the Distributed Computing Toolbox, which can give you additional ideas about how to set up your jobs.