Configuration File for Creating Deployable Archive Using the mcc Command

When you create a deployable archive using the mcc command, you must also create a text configuration file that specifies the following parameters:

Parameter Type    Description

mw.ds.out.type

Output type of data from Hadoop® mapreduce job

The options are:

  • keyvalue

  • tabulartext

mw.mapper

Name of MATLAB® map function

mw.reducer

Name of MATLAB reduce function

mw.ds.in.format

Name of MAT-file containing a datastore object representing the format of the data to be processed.

In most cases, you start by working with a small sample dataset on your local machine that is representative of the actual dataset on the cluster. The sample has the same structure and variables as the actual dataset. Creating a datastore object for the local sample captures a snapshot of that structure. A Hadoop job executing on the cluster uses this datastore object to determine how to access and process the actual dataset residing on HDFS™.

mw.ds.in.type

Input type of data to Hadoop mapreduce job

The options are:

  • keyvalue

  • tabulartext

mw.ds.in.fullfile

Default value is false
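
As an illustration of the MAT-file named by mw.ds.in.format, the following is a minimal sketch of creating a datastore from a local sample and saving it. It assumes the sample file is airlinesmall.csv (a sample dataset that ships with MATLAB) and that the variables of interest are UniqueCarrier and ArrDelay; both are assumptions for illustration, so substitute your own file and variable names. The MAT-file name matches the sample configuration file in this topic.

```matlab
% Create a datastore for a small local sample that has the same
% structure as the full dataset on the cluster (assumed sample file).
ds = datastore('airlinesmall.csv', 'TreatAsMissing', 'NA');

% Restrict the datastore to the variables the map function needs
% (assumed variable names from the sample file).
ds.SelectedVariableNames = {'UniqueCarrier', 'ArrDelay'};

% Save the datastore object to the MAT-file that mw.ds.in.format names.
save('infoAboutDataset.mat', 'ds');
```

The saved variable holds only the description of the data format, not the data itself, which is why a small local sample is sufficient.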

Sample Configuration File

config.txt

mw.ds.out.type = keyvalue
mw.mapper = maxArrivalDelayMapper
mw.reducer = maxArrivalDelayReducer
mw.ds.in.format = infoAboutDataset.mat
mw.ds.in.type = tabulartext
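
The configuration file is passed to mcc when the archive is built. The following sketch shows one way this might look when run from the MATLAB command prompt. It assumes that config.txt, the map and reduce function files, and the MAT-file are all in the current folder, and it uses maxArrivalDelay as a hypothetical archive name. Check the option syntax against the mcc reference page for your release before relying on it.

```matlab
% Build a deployable archive for Hadoop using the sample config.txt.
% 'maxArrivalDelay' is a hypothetical archive name chosen for this sketch.
mcc('-H', ...
    '-W', 'hadoop:maxArrivalDelay,CONFIG:config.txt', ...
    'maxArrivalDelayMapper.m', ...    % file for the function named by mw.mapper
    'maxArrivalDelayReducer.m', ...   % file for the function named by mw.reducer
    '-a', 'infoAboutDataset.mat');    % MAT-file named by mw.ds.in.format
```

The function names in config.txt are given without the .m extension, while mcc receives the corresponding file names.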
