Spardat2SSD

Version 1.0.0.0 (8.99 KB) by Skynet
Convert a data file from spardat to SSD format.
1.1K Downloads
Updated 16 Apr 2007


spardat2ssd(FILEIN,FILEOUT,DATATYPE,DISPINTERVAL) converts the contents of the input file FILEIN and writes the result to the output file FILEOUT. FILEIN is the name of a data file in Spardat format, such as that used by SVM-Light. FILEOUT is the name of the data file to be written in Simple Sparse Dataset (SSD) format, such as that used by Auton Lab.

The argument DATATYPE is optional. Its value can be either Categorical or Real, with the default being Categorical. Categorical applies to a data file in which every attribute value is 1. Real applies to a data file with real attribute values, e.g. -4, 3, 3.14. If set to Categorical (default), the output data file will have two columns: the first is the row number (starting from 0) and the second is the column number (with the class being column 1). If set to Real, the output data file will also have a third column, which holds the real-valued attribute value.

The argument DISPINTERVAL is also optional. It controls how often the conversion status is displayed. Its default value is 100, which means that the status is displayed after every 100 lines processed from the input file. For any nonzero value, the status is also displayed once the input file has been fully processed. If set to 0, the status is never displayed.
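
The DISPINTERVAL behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code inside spardat2ssd: the variable names (dispInterval, nLines, shownAt), the message text, and the line count of 250 are all hypothetical.

```matlab
dispInterval = 100;   % documented default value
nLines = 250;         % hypothetical number of lines in the input file
shownAt = [];         % line counts at which a progress status was shown

for k = 1:nLines
    % ... conversion of input line k would happen here ...
    if dispInterval > 0 && mod(k, dispInterval) == 0
        fprintf('Processed %d lines\n', k);
        shownAt(end+1) = k; %#ok<AGROW>
    end
end

% Final status once the whole file has been read, unless display is disabled.
if dispInterval > 0
    fprintf('Finished: %d lines processed\n', nLines);
end
```

Setting dispInterval to 0 in this sketch suppresses both the periodic and the final message, which matches the documented "never display" case.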

EXAMPLES:

spardat2ssd('spardat_categorical.sample.data','ssd_categorical.sample.csv','Categorical',2)

spardat2ssd('spardat_real.sample.data','ssd_real.sample.csv','Real',0)

spardat2ssd('spardat.data','ssd.data')

spardat2ssd('spardat.data','ssd.data','Real')

REMARKS:

The input file must contain only two-class cases. The class value must appear in the first column of the input file. Positive classes must be represented as 1, and negative classes as either -1 or 0.
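
As a rough illustration of the class-value rule above, the check below accepts only 1, -1, and 0 and treats 1 as the positive class. It is a sketch under assumptions: the labels vector is made-up sample data, and the error message is not taken from spardat2ssd.

```matlab
labels = [1 -1 0 1];   % hypothetical class values read from the first column

% Reject anything other than the three permitted class values.
if ~all(ismember(labels, [1 -1 0]))
    error('Class values must be 1, -1, or 0.');
end

% 1 marks the positive class; both -1 and 0 mark the negative class.
isPositive = (labels == 1);
```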

Lines beginning with the # character in the input file are ignored as comments. Additionally, anything after the # character in any line of the input file is also ignored as a comment.
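
The comment rule can be pictured with the following minimal sketch, which strips a comment from one line and then splits the remainder into a class value and index:value pairs in the SVM-Light style. The sample line and all variable names are hypothetical, and this is not necessarily how spardat2ssd itself parses its input.

```matlab
% Hypothetical input line: class value, index:value pairs, trailing comment.
txt = '-1 3:1 7:0.5 # trailing comment';

% Drop everything from the first # onward (the whole line if it starts with #).
hashPos = find(txt == '#', 1);
if ~isempty(hashPos)
    txt = txt(1:hashPos-1);
end
txt = strtrim(txt);

% The first token is the class value; the rest are index:value pairs.
[labelTok, rest] = strtok(txt);
label = str2double(labelTok);
pairs = sscanf(rest, '%d:%f', [2 Inf])';   % one row per [index value] pair
```

A line that begins with # leaves txt empty after stripping, so a caller could simply skip it.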

At the time of writing, Auton Lab's software products do not seem to support SSD files that contain real-valued attribute values. Output produced with DATATYPE set to Real therefore might not have any practical use.

Cases that do not have any stated feature values are processed correctly.

The source code has received only very limited testing. There is also a lot of room to optimize it, especially for conciseness.

[Please subscribe to this file if you use it, so you can be notified of updates.]

Cite As

Skynet (2024). Spardat2SSD (https://www.mathworks.com/matlabcentral/fileexchange/14367-spardat2ssd), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2006b
Compatible with any release
Platform Compatibility
Windows macOS Linux
Version Published Release Notes
1.0.0.0

(1) Added a remark that empty cases are processed correctly.
(2) Removed a prohibitive validity check.